Google has created a program that lets a viewer “fly into” a still photo, using artificial intelligence (AI) to build a 3D model of the scene.
In a new paper titled InfiniteNature-Zero, the researchers take a landscape photo and use AI to “fly” into it like a bird, with the software generating a plausible landscape through machine learning.
To achieve this, the researchers had to fill in information that a still photo doesn’t provide, such as areas hidden from view. For example, a spot obscured by trees needs to be generated. This is done through “inpainting”: the AI predicts what it thinks would be there, using machine learning models trained on huge datasets.
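As a rough illustration, the idea behind inpainting can be sketched in a few lines of NumPy. This toy version simply diffuses neighboring colors into the missing region; the researchers’ learned model instead predicts plausible new content, but the goal of filling hidden pixels is the same. The `inpaint` function below is illustrative, not from the paper:

```python
import numpy as np

def inpaint(image, mask, iterations=200):
    """Fill masked pixels by repeatedly averaging their neighbors.

    A toy diffusion-based inpainter: a learned model would hallucinate
    texture and structure instead of just smoothing. `mask` is True
    where pixels are missing.
    """
    img = image.astype(float).copy()
    img[mask] = 0.0
    for _ in range(iterations):
        # Average of the four axis-aligned neighbors
        # (np.roll wraps at the edges, which is fine for a sketch).
        up = np.roll(img, -1, axis=0)
        down = np.roll(img, 1, axis=0)
        left = np.roll(img, -1, axis=1)
        right = np.roll(img, 1, axis=1)
        avg = (up + down + left + right) / 4.0
        img[mask] = avg[mask]  # only missing pixels are updated
    return img
```

On a smooth image the masked hole converges to its surroundings; a real inpainting network goes further and invents detail that was never visible.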
The AI must also generate imagery beyond the photo’s boundaries to achieve the flying effect. This is called “outpainting” and is much like the content-aware tools in Photoshop: the AI generates a wider image based on the original photo, aided by deep learning on massive datasets.
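A crude classical stand-in for outpainting is simply extending an image past its border with mirrored content, which NumPy can do directly. Google’s model hallucinates genuinely new scenery instead; this sketch only recycles what is already there:

```python
import numpy as np

def outpaint(image, border):
    """Extend a 2D image past its boundary by mirroring edge content.

    A toy stand-in for learned outpainting: reflection produces a
    seamless but repetitive border, whereas a trained network would
    invent plausible new scenery beyond the frame.
    """
    return np.pad(image, pad_width=border, mode="reflect")
```

The original pixels survive unchanged in the center of the enlarged canvas, with synthetic (here, mirrored) content added around them.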
Anyone who has zoomed in on a photograph will be familiar with the way image quality drops as it becomes blurry. To prevent this, Google uses superresolution, a process in which AI turns a noisy, pixelated image into a crisp one.
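For contrast, here is what classical (non-AI) upscaling looks like: plain bilinear interpolation in NumPy. It enlarges the grid but cannot add detail, which is exactly the gap a superresolution network fills by synthesizing plausible fine texture. The `upscale2x` helper is illustrative only:

```python
import numpy as np

def upscale2x(image):
    """Double a 2D image's resolution with bilinear interpolation.

    Classical upsampling only interpolates between existing pixels;
    a superresolution network instead predicts the missing detail.
    """
    h, w = image.shape
    # Source-image coordinates for each target pixel.
    ys = np.linspace(0, h - 1, 2 * h)
    xs = np.linspace(0, w - 1, 2 * w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]  # vertical blend weights
    wx = (xs - x0)[None, :]  # horizontal blend weights
    img = image.astype(float)
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

The result is smooth rather than sharp, which is why the fly-through would blur without a learned superresolution step.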
The program, which researchers named “Perpetual View Generation of Natural Scenes from Single Images,” combines these three techniques (inpainting, outpainting, and superresolution) to create the flying effect.
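The overall structure can be sketched as a simple render-and-refine loop: move the camera, then repair and sharpen the resulting frame before moving again. This is hypothetical glue code, not the paper’s implementation; `warp`, `inpaint`, `outpaint`, and `superres` are placeholders for the corresponding stages:

```python
def perpetual_view(image, camera_steps, inpaint, outpaint, superres, warp):
    """Generate one frame per camera step: warp, repair, sharpen, repeat.

    A hypothetical sketch of the loop described in the article; the
    four callables stand in for the paper's actual neural components.
    """
    frames = [image]
    for step in camera_steps:
        frame = warp(frames[-1], step)  # move the camera; holes appear
        frame = inpaint(frame)          # fill regions that were occluded
        frame = outpaint(frame)         # extend past the old frame border
        frame = superres(frame)         # restore crisp detail after zooming
        frames.append(frame)
    return frames
```

Each output frame becomes the input for the next step, which is why errors compound and why earlier attempts broke down so quickly.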
In previous attempts by the researchers, the image broke down almost immediately as the viewer flew in. In this paper, by Google Research, Cornell University, and UC Berkeley, the image holds up much longer. Although not perfect, it is a significant improvement over previous attempts.
The latest paper is also a significant step because prior perpetual view generators were trained on real drone footage, whereas these new models were trained only on single photos of landscapes.
“This AI is so much smarter than the previous one that was published just a year ago,” says Karoly Zsolnai-Feher, from Two Minute Papers.
“And it requires training data that is much easier to produce at the same time. It is amazing to think about what the future holds for us. So cool!”
Google’s AI Research
Google’s AI team builds on Neural Radiance Fields (NeRF), which previously allowed researchers to create detailed 3D models of real-world locations and powerfully denoise images, effectively enabling users to “see in the dark.”
Those earlier programs required many images to generate a scene, while the perpetual view generator requires only one.
Earlier this year, PetaPixel reported on Samsung Labs developing a way to create high-resolution avatars, or deepfakes, from a single still photo in a project called MegaPortraits.