Following Meta’s text-to-video generator, Google now has its own artificially intelligent movie maker.
Google’s Imagen Video is still in its development phase, but the company says it will be capable of producing 1280×768 videos at 24 frames per second from a written prompt.
According to Google’s research paper, Imagen Video will have stylistic abilities, such as generating videos based on the work of famous artists like Vincent van Gogh. It will also generate 3D rotating objects while preserving their structure and render text in various animation styles.
Google hopes its AI-video model will “significantly reduce the difficulty in high-quality content creation.”
As described by Google’s research team, Imagen Video will take a text description and generate a 16-frame, three-frames-per-second video at 24×48 pixel resolution. The system then upscales and “predicts” additional frames, producing a final 128-frame, 24-frames-per-second video at 1280×768, roughly 720p.
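The frame counts and frame rates quoted above imply that the clip length stays the same while the number of frames grows. The following is only a back-of-the-envelope sketch of that arithmetic using the article's figures; the helper name `clip_seconds` is made up for illustration and is not part of any Google code.

```python
from fractions import Fraction


def clip_seconds(frames: int, fps: int) -> Fraction:
    """Return the clip duration in seconds as an exact fraction."""
    return Fraction(frames, fps)


# Figures quoted in the article: the initial clip and the final clip.
base_frames, base_fps = 16, 3
final_frames, final_fps = 128, 24

base_duration = clip_seconds(base_frames, base_fps)
final_duration = clip_seconds(final_frames, final_fps)

# How many times more frames the final clip has than the initial one.
frame_factor = final_frames // base_frames

print(f"initial clip: {float(base_duration):.2f} s")
print(f"final clip:   {float(final_duration):.2f} s")
print(f"frames multiplied by {frame_factor}")
```

Both durations come out to 16/3 of a second, a little over five seconds, so on these numbers the added frames make the motion smoother rather than the clip longer.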
Google says that Imagen Video has been trained on 14 million video-text pairs and 60 million image-text pairs, as well as the LAION image-text dataset, which was used to train Stable Diffusion.
Among the examples provided by Google are a panda chewing on bamboo, a zooming shot into a choppy sea filled with pirate ships, and an astronaut riding a horse.
It is worth noting that all the results from Imagen Video were selected by Google itself, and no independent testers have yet tried the program.
That said, the research paper claims that Imagen Video can render text properly, something that DALL-E and Stable Diffusion both struggle with. The text that those programs generate is barely readable.
The paper also claims that Imagen Video has demonstrated an understanding of depth and three-dimensionality, allowing it to create drone flythrough videos that rotate around objects and capture them from different angles without distortion.
Google has voiced concerns over “problematic data” used to train its AI-image generator programs. The company has tried to remove explicit and violent sexual content as well as cultural stereotypes. It is also concerned that the tool may be used “to generate fake, hateful, explicit, or harmful content.”
“We have decided not to release the Imagen Video model or its source code until these concerns are mitigated,” adds Google.