Synthesizing Human ‘Performance’

I became aware of Synthesia.io when researching my articles on Amplified Creativity. More accurately, it was when I was updating my article just before publication three months later: I had to rewrite my “three years out” prediction for something like the Synthesia service, because it became a reality within that three-month window! Yesterday I became aware of another synthetic avatar (AI) tool that has a very different business model, and its avatars appear to be a step ahead of Synthesia’s, although it may be unfair to compare.

I’m using “performance” in the broadest sense, rather than implying the performance of a talented actor in an emotional role. All presenters perform something: whether they’re the face of a company at a trade show or the regular presenter on Spectrum’s commercials, they’re performing someone’s script. I would also opine that they are performing the script with about as much understanding as Synthesia’s avatars do!

As impressive as Synthesia is, when you become familiar with it, there are places where it’s not perfect. Like all synthesized speech, it sometimes takes a little tweak to get the pronunciation right, and there are moments where it veers into uncanny valley territory. For the anticipated uses (social media, corporate communication and training) it’s a very useful tool that will facilitate more video production because the “talent shoot” is almost instant.

Thanks to some quite good support, I’ve learnt that Synthesia’s current Avatars are generation one, and a new generation is expected during 2022. If their version two Avatars are like Rephrase.ai’s then it will be a big step up.

Rephrase’s business model is very different, and in many ways that makes its Avatar creation a more focused solution. If you look at the examples, the speech match and more natural body movement render an even more believable fake.

Perhaps it’s unfair to compare. Synthesia has both generic and custom Avatars, intended to be available for a range of professional presentations across many languages. Rephrase takes a recorded performance and provides more limited customization, which is almost certainly a more achievable goal than a full generic performance generator. Both are indicators of the direction we’re going.

Rephrase’s B2B model is another point of departure from Synthesia’s more self-service model. Rephrase probably knows the names of all its customers!

I fear that careers as on-camera presenters for the corporate and educational markets may become limited. On the other hand, our use of Synthesia presenters has resulted in much more marketing and education content being produced. Very occasionally someone notices our presenters are “a little off,” but it’s rare.
