MegaPortraits: High-Res Deepfakes Created From a Single Photo

Researchers from Samsung Labs have developed a way to create high-resolution avatars, or deepfakes, from a single still photo or even a painting.

The team calls it MegaPortraits and claims it can create avatars up to megapixel size from a single frame. According to the team, their breakthroughs tackle the problem that the driver's appearance may differ from the source image. For example, it is much harder to drive an effective deepfake of Angelina Jolie when the person performing the motion is a short-haired male.

The reason the deepfakes featuring Tom Cruise were so successful was that the performer closely resembled Cruise and could replicate his mannerisms. A performer who looked very different would have a much harder time creating a convincing deepfake of Cruise.

This method appears to solve that problem and allows anyone to control realistic avatars even if they do not closely resemble the target.

“We propose a set of new neural architectures and training methods that can leverage both medium-resolution video data and high-resolution image data to achieve the desired levels of rendered image quality and generalization to novel views and motion,” the researchers state in their abstract.

“We demonstrate that the suggested architectures and methods can produce convincing, high-resolution neural avatars.”

The system, known as MegaPortraits, short for megapixel portraits, creates high-resolution avatars of humans. The researchers explain that the model is trained in two stages, with an optional third stage that allows it to run faster.

“Our training system is fairly standard. We sample two random frames from our dataset at each step: the source frame and the driver frame,” the team explains. The model overlays the driver frame’s motion (i.e., the facial expression and head position) onto the source frame in order to create the output image.

“The main learning signal comes from training episodes in which the driver and source frames are from the same video. Therefore, our model is trained to predict the driver frame.”
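The sampling scheme the researchers describe can be sketched in a few lines. This is a hypothetical illustration, not the paper's code: the dataset layout, the function name, and the `same_video_prob` parameter are all assumptions made for the example.

```python
import random

def sample_training_pair(dataset, same_video_prob=1.0):
    """Sample a (source, driver) frame pair for one training step.

    `dataset` is assumed to be a list of videos, each a list of frames.
    When both frames come from the same video, the driver frame itself
    serves as the ground-truth target -- the main learning signal the
    researchers describe. Cross-video pairs have no pixel-level target.
    """
    if random.random() <= same_video_prob:
        video = random.choice(dataset)
        source = random.choice(video)
        driver = random.choice(video)
        target = driver  # the model is trained to predict the driver frame
    else:
        src_video, drv_video = random.sample(dataset, 2)
        source = random.choice(src_video)
        driver = random.choice(drv_video)
        target = None  # no exact ground truth for cross-identity pairs
    return source, driver, target
```

With same-video pairs, supervision is straightforward: the network's output can be compared pixel-by-pixel against the driver frame it is asked to reproduce.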

Perhaps even more remarkably, the team managed to compress the system into a model that runs in real time.

“We show how a trained high-resolution neural avatar model can be distilled into a lightweight student model which runs in real-time and locks the identities of neural avatars to several dozens of pre-defined source images,” the researchers continue. The results the team has shared demonstrate the system’s effectiveness. The full project, titled MegaPortraits: One-shot Megapixel Neural Head Avatars, has been published as a scientific paper.
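The distillation idea the quote describes can be illustrated with a toy example. Everything here is a hypothetical stand-in, not the paper's architecture: the "teacher" is reduced to a fixed linear map, and the student is fitted purely to reproduce the teacher's outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: the "teacher" plays the role of the full
# high-resolution model, reduced here to a fixed linear map from a
# driver-pose vector to output features.
teacher_weights = rng.normal(size=(8, 8))

def teacher(pose):
    """Full (expensive) model: maps driver poses to output features."""
    return pose @ teacher_weights.T

# Distillation: the lightweight student is fitted only to mimic the
# teacher's outputs on sampled driver poses. In the paper, the student
# is additionally locked to a few dozen pre-defined source identities.
poses = rng.normal(size=(1000, 8))
student_weights, *_ = np.linalg.lstsq(poses, teacher(poses), rcond=None)

def student(pose):
    """Distilled model: a cheaper map that reproduces the teacher."""
    return pose @ student_weights
```

The design point the quote makes is that speed is bought by narrowing scope: the student no longer generalizes to arbitrary identities, which is what lets it run in real time.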