
"Banana Revolution" Revealed for the First Time: Google's Crazy Engineers Go All Out on Text Rendering and Accidentally Create the Most Powerful Model

新智元 2025-08-29 15:52
Google's Nano Banana model: multi-image fusion and 2D-to-3D conversion, powered by Gemini, redefine the boundaries of creativity

[Introduction] Google's latest image model, nano banana, has arrived. It can not only combine multiple images into new pictures but also understand geography, architecture, and physical structures, and it can even transform 2D maps into 3D landscapes. Drawing on Gemini's world knowledge and interleaved generation, the model supports "memory-enabled" multi-round creation with high consistency and creativity. Nano banana is reshaping the boundaries of AI image generation and stirring up speculation about the future of "AI creative partners".

What the heck (°ロ°)! Why has the AI circle suddenly started the "Nano Banana Revolution"?

Google did not expect that releasing a new image model would set off such a storm in the community!

This "banana" has recently become so popular that it feels like a return to the "Studio Ghibli craze" that OpenAI set off a few months ago.

This Superman cosplay picture was generated by nano banana. It's so awesome!

But this time, Google's nano banana opens up far more disruptive ways to play. The Studio Ghibli craze revolved around a single generation style; with nano banana, Google probably did not anticipate just how inventive netizens would be.

For example, you can upload up to 13 pictures and then let nano banana combine them.

Can you believe that the picture above was assembled by AI from the following "parts"?

According to Google, nano banana is not just an image model: it also has Gemini's powerful world knowledge.

This takes nano banana's understanding to a new level (an exclusive interview with the Google team later in the article reveals the latest technical route behind the model).

Since it can combine objects in the physical world, can it also "combine" human actions?

Isn't this just a perfect storyboard? Netizens then went on to use Conch AI to turn it into the following short film.

It seems that making movies with AI is not out of reach after all!

Because nano banana has Gemini's world knowledge, you only need to upload a screenshot of the real world, and it can label the content for you.

For example, label the Tokyo Tower in the picture.

You can also label more buildings.

It can even outline human figures from a robot's point of view. Isn't this the Terminator's perspective? It gives off a cyberpunk vibe!

The most amazing thing is that nano banana can "see" the "3D world" from a "2D map".

Netizens love using Nano Banana to transform Google Maps images by asking, "What does the red arrow see?"

For example, the Golden Gate Bridge seen from the west.

Or the Tokyo Tower seen from the east.

Even more amazingly, Nano Banana seems to genuinely understand contour lines from geography and can draw realistic landforms directly from them.

It can even easily handle the engineering-drawing views that used to give us headaches.

It can render any picture into top, bottom, left, right, front, and back views.

You can even use nano banana for a customized virtual try-on, and any element can be "worn" on the body.

It does not even have to be clothes: it can directly replicate actions as well.

On X, netizen @ZHO_ZHO_ZHO used a portrait + action framework to directly achieve a studio-level shooting effect.

Conversely, it can extract the physical structure of real-world buildings from images.

It can even do "reverse" photo editing: first turn the original picture into a black-and-white wireframe, then choose your favorite colors, and finally recolor the picture.

Nano banana is very accurate at converting pictures to line drawings and at coloring them.

Of course, wild ideas and pranks are never absent.

For example, making Ultraman perform on the pommel horse while wearing clothes.

Besides creating "new" pictures, nano banana can also restore "old" photos.

It can repair damage and creases and bring back the clarity of pictures that have faded over time.