This AI tool creates singing, rapping, talking avatars from a single image and even the Mona Lisa isn’t safe from spitting bars




Remember the late-night talk show bit where a picture of a political figure is shown with another person's mouth superimposed over it to make them say dubious things? It always looked a little rough, but that was part of the effect. Well, this new AI tool also takes still images of human subjects and animates their mouth and head movements, but this time the effect is surprisingly, almost worryingly, convincing.

The tool is called EMO: Emote Portrait Alive, and it was developed by several researchers at the Institute for Intelligent Computing, part of the Alibaba Group. The tool takes a single reference image, extracts generated motion frames, and then combines them with vocal audio through a complex diffusion process. In that process, the facial area is integrated with noise samples from multiple frames and then denoised, with generated frames added along the way to stay in sync with the audio. The result is a video of the subject that not only lip-syncs but also shows varied facial expressions and head poses.




