
Thanks for the invitation. I'm drawing figure eights at the intersection. Suddenly, Google Gemini whispers in my ear: "Go towards the red house, you silly goose."

New Intelligence Yuan · 2026-03-30 14:54
Gemini takes over walking and cycling navigation, understanding human language and the physical world.

[Introduction] Google Maps' Nuclear Update: Gemini Takes Over Walking and Cycling Navigation! Ask about toilets in cafes? Ask how many available EV charging spots are left? Ask about the vibe of a neighborhood? Gemini instantly understands human language and the physical world. People with a poor sense of direction, stand up! You'll never have to hold your phone and spin around in frustration again!

The most vicious curse in the world is that nonchalant line in navigation: "Walk 500 meters east."

Even if you hold your phone up to the sky and draw figure-eights like a shaman, that damned arrow still just spins in place.

When it comes to "not being able to tell north, south, east, or west", human dignity has been humiliated by GPS for a full 20 years.

But today, this feeling of being made to look stupid has finally come to an end.

Google Maps has just dropped a bombshell: Gemini has officially taken over walking and cycling navigation.

From now on, your phone won't just repeat coordinates. Instead, it will whisper in your ear: "Turn right at the Starbucks intersection up ahead. Yes, the red building with the posters on it."

People with a poor sense of direction, stand up!

Don't humiliate me with coordinates. Speak "human language"!

Traditional navigation is built on GPS coordinates, a machine language. It has no idea what "500 meters" means to a human.

With Gemini in the loop, those machine instructions are turned into semantic understanding.

To deliver an instruction as simple as "Turn right after the gas station", Gemini cross-references data on 250 million places worldwide with a vast archive of Street View imagery behind the scenes.

It has to identify which buildings are "conspicuous" and which landmarks are "well known", and make sure each one is visually unambiguous in the real world.

For a while now, Google has been working to give Maps "context awareness" through Gemini.

In the initial update in November 2025, this capability was limited to driving; now it has been extended to walking and cycling.

You can interrupt it at any time: "What's fun in the neighborhood I'm in right now?" or "Is there a cafe with a toilet nearby?"

You no longer need to keep second-guessing that vague arrow; you can rely on the distinctive "blue statue" at the intersection instead.

Google is transforming Maps from a static direction guide into a real-time, conversational navigation experience.

"Hands-free" agency: you're here to walk, not to fiddle with the screen

The comprehension problem has been solved. So how do we achieve "sensory synergy"?

On January 29th, Gemini officially "stepped off" the car dashboard and entered walking and cycling, scenarios with extremely low tolerance for fiddly interaction.

If you're riding a bike or carrying a couple of pounds of freshly bought ribs, Gemini's hands-free agency is a lifesaver.

You don't need to stop, take off your gloves, or type on the screen on the street. Just ask directly:

What's that building on the side of the road that looks like an alien spaceship? Also, search for a cafe with a toilet nearby.

This deep app integration turns the map into a dynamic task center.

If you're strolling down an unfamiliar street, Gemini can also serve as a real-time encyclopedia.

You can also ask questions at will: "Which neighborhood am I in?" or "What are the must-see attractions nearby?"

You can also fire off complex, long-form queries for specific practical needs, such as: "Is there an affordable cafe with a toilet on this route?"

This kind of multi-dimensional filtering across physical details (toilets, parking spaces, price ranges) demands a depth of real-world data that ordinary AI search can hardly reach.
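Such attribute-level filtering can be sketched in a few lines, assuming the underlying place data exists. The place records and field names below are invented for illustration; this is not the Maps API.

```python
# Hypothetical sketch of multi-attribute filtering over place data
# (restroom, price level, on-route). All records and field names are
# invented for illustration; this is not the real Maps data model.
places = [
    {"name": "Bean There", "has_restroom": True,  "price_level": 1, "on_route": True},
    {"name": "Gold Roast", "has_restroom": True,  "price_level": 3, "on_route": True},
    {"name": "Quick Cup",  "has_restroom": False, "price_level": 1, "on_route": True},
    {"name": "Side Brew",  "has_restroom": True,  "price_level": 1, "on_route": False},
]

def affordable_cafes_with_restroom(places, max_price=2):
    """Answer 'Is there an affordable cafe with a toilet on this route?'"""
    return [p["name"] for p in places
            if p["on_route"] and p["has_restroom"] and p["price_level"] <= max_price]

print(affordable_cafes_with_restroom(places))  # → ['Bean There']
```

The hard part is not the filter itself but having trustworthy structured data for every attribute, which is exactly the moat the article describes.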

Moreover, Gemini supports continuous conversations within the navigation screen.

You can first ask: "Is there a vegetarian restaurant within 2 miles ahead?" After getting the result, you can then ask: "What's the parking situation there?"
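That follow-up works only if the assistant remembers what "there" refers to. A toy sketch of this context carry-over, with invented data and a deliberately minimal memory model:

```python
# Minimal sketch of how a follow-up like "What's the parking situation
# there?" can resolve "there" against the previous answer. Data and
# class names are invented; real assistants track far richer context.
class NavChat:
    def __init__(self, places):
        self.places = places
        self.last_result = None  # referent for "there" in follow-ups

    def find(self, cuisine):
        """First turn: find a restaurant and remember it as context."""
        self.last_result = next(p for p in self.places if p["cuisine"] == cuisine)
        return self.last_result["name"]

    def parking_there(self):
        """Second turn: 'there' resolves to the remembered place."""
        if self.last_result is None:
            return "Which place do you mean?"
        return f"{self.last_result['name']}: {self.last_result['parking']}"

chat = NavChat([
    {"name": "Green Fork", "cuisine": "vegetarian", "parking": "small lot, usually full"},
    {"name": "Grill House", "cuisine": "bbq", "parking": "street only"},
])
chat.find("vegetarian")       # → 'Green Fork'
print(chat.parking_there())   # → Green Fork: small lot, usually full
```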

Note that this is not simple voice recognition. It is more like welding Maps, Gemini, WeChat, and the calendar together.

This multi-dimensional filtering for "physical survival needs" is where AI's real combat power lies.

The Three-Dimensional Teleportation of the "All-Seeing Eye": Google Lens

If landmark navigation solves "how to get there", then Gemini + Google Lens closes the information gaps of "where to go" and "what to do once you're there".

The map has evolved from a two-dimensional coordinate plane into a decoder of the three-dimensional physical world.

In the Maps search bar, tap the camera icon and point it at the building in front of you, and Gemini starts decoding the semantics of the physical scene in real time.

You can ask things like: "Where is this? Why is it famous?" or "What's the atmosphere like here?"

The AI instantly retrieves the profiles of 250 million places and combines them with a vast number of user reviews to give you a warm answer rather than a cold rating.

Google can even dig out "hidden knowledge".

Through the brand-new Gemini Tips module, you can learn a restaurant's "hidden menu", the smartest way to make a reservation, and even the hardest-to-find entrance of a large shopping mall before you set off.

These tiny details are almost impossible to achieve through keyword screening in traditional searches.

Electric vehicle owners no longer have to hunt for charging stations. Maps not only tells you where the stations are; it also uses historical data and real-time networks to predict how many spots will still be free when you arrive.
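One way such a prediction could blend a live snapshot with historical occupancy is sketched below. The numbers, the hour-of-day model, and the blending weight are all invented for illustration; the real system is certainly more sophisticated.

```python
# Toy sketch of predicting free charging spots from historical occupancy
# (hour-of-day average) blended with the latest live reading. The data
# and the blending weight are invented for illustration only.
def predict_free_spots(history, live_free, arrival_hour, capacity, live_weight=0.6):
    """history maps hour-of-day -> average number of occupied spots."""
    expected_occupied = history.get(arrival_hour, capacity / 2)
    historical_free = max(0.0, capacity - expected_occupied)
    # Blend the live snapshot with the historical expectation for the ETA.
    blended = live_weight * live_free + (1 - live_weight) * historical_free
    return round(blended)

history = {8: 9.5, 12: 6.0, 18: 11.0}   # a 12-spot station, by hour
# 4 spots free now, but at 18:00 the station is historically almost full:
print(predict_free_spots(history, live_free=4, arrival_hour=18, capacity=12))  # → 3
```

The point of the blend is that a spot that is free now may be gone by the time you arrive, so the live reading alone is not enough.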

This information advantage makes traditional search look like a product of the last century.

Dimensional Warfare: Why Can't SearchGPT Win for Now?

Silicon Valley is constantly shouting that SearchGPT or Perplexity will overthrow Google.

But in the "physical world", they can't compete at all.

SearchGPT is an all-knowing "digital ghost". It understands web pages and logic, but it's "blind" on the road.

It has neither 20 years of global Street View imagery nor real-time control over 250 million merchants.

When you want to know whether that restaurant's signboard looks good, or whether there are steps at that intersection, an AI can't reason its way to the answer. It has to "see it with its own eyes".

Google has activated these dormant visual assets through Gemini, giving AI physical semantics. This is a gap that no large model trained purely on text can currently cross.

According to the Local Visibility Index report released by SOCi, when handling specific local merchant information (addresses, business hours, real-time status), ChatGPT's accuracy is only 68%, while Gemini achieves 100% coverage and precise alignment.

In a scenario like navigation with extremely low tolerance for errors, a 32% error rate is enough for users to vote for Google.

Google's ambition goes far beyond maps. Some speculate that Google is building a full-scenario agent loop: Chrome handles complex tasks in the digital world (booking tickets, comparing prices), while Maps handles complex tasks in the physical world (leading the way, scouting stores, communicating on your behalf).

The essence of this competition is a competition between "cognition" and "existence".

OpenAI has a more agile brain, but Google has the most substantial physical presence.

In the era of AI agents, only an AI that can truly see and move in the physical world can be called a real agent.

In the future, you may no longer use a map. Instead, you'll "converse" with the city.

Google is using Gemini to stitch up the last crack between the digital world and the physical world.

From Chrome's automated agents to Maps' full-scenario "blind operation", AI is taking over our senses.

Next time you're standing at a strange crossroads, don't just stare at the spinning arrow like a fool. Put on your headphones and directly ask that silicon-based co-pilot:

"Take me to that hidden restaurant that only locals know about. Also, help me check if there are any outdoor seats with a view there now?"

If you haven't felt this anxiety yet, go stand at a crossroads and try walking "500 meters east".

Reference materials:  

https://techcrunch.com/2026/01/29/google-maps-now-lets-you-access-gemini-while-walking-and-cycling/ 

https://techcrunch.com/2025/11/05/google-maps-bakes-in-gemini-to-improve-navigation-and-hands-free-use/ 

This article is from the WeChat official account "New Intelligence Yuan". Editor: Qingqing. Republished by 36Kr with permission.