
Apple's AI stuns the industry overnight: Siri is finally no longer "dumb," and the Gemini "heart replacement" surgery was a success

Lei Technology 2026-06-09 08:55
Chinese mainland users still have no access to Apple Intelligence.

The long-awaited WWDC26 keynote has finally wrapped up, and Lei Technology watched the whole event. First, the key point: the new AI features discussed below (Apple Intelligence, Siri AI) are, at this stage, unavailable to users in mainland China and on Apple devices sold in mainland China. This AI rollout still bypasses both mainland China and the EU.

Image source: Apple

However, if you use an overseas-model phone with an overseas Apple ID, this Apple Intelligence update can fairly be called "generous and satisfying." It not only lets Apple Intelligence catch up in functionality with the most aggressive AI Agent phones on the market, but also delivers a distinctly Apple "combination punch" built on the company's cross-platform ecosystem and its emphasis on privacy protection. Users of mainland China models cannot use it, but they can at least "quench their thirst by looking at plums." After all, in the cycle of "expectation, disappointment, expectation," what if Apple Intelligence really does arrive as promised one day?

"Replacing the heart" with Google Gemini, Apple also has its own tricks

Let's start with the brand-new Apple Intelligence model. As the earlier high-profile "teaser" suggested, the new-generation Apple Foundation Model (hereinafter AFM) is built on a Gemini foundation model (which generation is not yet certain) rather than using Gemini directly.

If that relationship is hard to picture, Lei Technology offers an analogy:

Restaurant A's dishes taste terrible. It thinks the dishes in Restaurant G are quite good. So it buys a complete recipe (Gemini foundation model) from Restaurant G and modifies it according to its own understanding to create its own recipe (AFM).

But the recipe is a one-time purchase. If Restaurant G later adjusts its recipe, it has no obligation to provide updates or support to Restaurant A, which has to figure things out on its own.

In Apple's hands, this foundation model, after taking in Gemini technology and being "privatized," has split into two branches: one runs directly on devices such as iPhones, iPads, and Macs, and the other runs on Apple's Private Cloud Compute servers. In other words, the new Apple Intelligence still follows a hybrid-model approach, but it no longer leans on OpenAI's API for everything, as it did when it was connected to ChatGPT.

Image source: Apple

Clearly, pairing an on-device large model with a Private Cloud Compute model does the most to keep user data within Apple's control, thereby protecting users' personal privacy. According to Apple, it cannot access user data, and the data is used only to respond to user requests.

In addition, Apple has launched a more powerful second-generation "on-device model" (not the AFM mentioned above) with better multimodal capabilities.

Image source: Apple

Correspondingly, iOS, iPadOS, and macOS provide more comprehensive system-level support for the new Apple Intelligence, letting it reach every corner of the "product suite."

The first major offering these new technologies bring to users is the brand-new Siri.

Powered by AI, Siri finally says goodbye to being a "dumb" voice assistant

The name of the new Apple Intelligence-driven Siri is rather uncreative: it is simply called "Siri AI." In interaction and comprehension, however, Siri AI improves significantly.

First, Apple has finally given Siri AI a standalone app, like the ChatGPT, Gemini, and Grok apps. It gives users a single place to view their complete interaction history with Siri AI.

Image source: Apple

In addition, the "Dynamic Island Siri" interface that leaked online earlier was confirmed at WWDC26. Besides long-pressing the side button or saying "Hey Siri," iPhone users can also activate Siri by swiping down from the top.

Image source: Apple

In terms of capabilities, the new Siri AI also has the abilities of "perception," "understanding," "invocation," and "execution."

Let's start with "perception." Thanks to AFM's improved multimodal capabilities, the new Siri AI can not only "hear" what users say but also "see" what the camera captures and what is displayed on the screen. This "perception" is not limited to text; images can be used as input too. Siri AI's voice perception, of course, keeps pace as well.

Image source: Apple

During the WWDC keynote, Apple did not disclose exactly how Siri AI perceives screen content. Lei Technology is not sure whether Siri works like the GUI Agents on Android phones, which rely on screenshots and screen recording, or whether Apple has used its position as a first-party developer to give Siri AI a new API.

After perceiving the screen content, Siri AI can understand it and respond accordingly. For example, it can find the shooting location of a travel photo, calculate how much each person should pay based on a restaurant receipt, or estimate the nutritional information of food.

As for execution, Apple has made full use of its "home-court advantage": Siri AI can use the perceived information directly to create complex tasks, such as generating a three-day, two-night travel plan for the location where a photo was taken. It can also call multiple system apps directly to carry out related operations (support for third-party apps is unknown).

On macOS, Siri AI has also unlocked more comprehensive capabilities. It can directly compare, summarize, and modify multiple documents, looking just like an AI Agent client.

Image source: Apple

Interestingly, following the earlier "optional voices," Apple has prepared a new round of voice customization for Siri AI: users can adjust its voice, tone, and speaking speed by "dragging the progress bar" to create a "unique" Siri AI.

Image source: Apple

As for languages, Siri AI currently supports English and will add more languages in the future, though not Chinese (Simplified or Traditional). And as mentioned at the beginning, this Siri AI rollout still bypasses both mainland China and the EU, so users of mainland China models still have to "wait for further notice."

It's not just Siri; Safari and Shortcuts are also powered by AI

Apple has also worked the four major AI capabilities ("perception," "understanding," "invocation," and "execution") of this "groundbreaking" Apple Intelligence release into its other apps.

For example, the new Safari, after integrating Apple Intelligence, can use its capabilities to perform "intelligent grouping" of open tabs. This is extremely useful for Lei Technology editors who always keep hundreds of tabs open to search for information.

Drawing on Apple Intelligence's multimodal perception, the AI-enabled Safari can now also monitor a chosen web page in the background and notify users when its content is updated. This is very useful for users who need to "grab tickets on the web."

That is not all. Safari can install third-party plugins, and the new Apple Intelligence lets users "create their own browser plugins": users simply tell Safari in natural language "what kind of plugin they need and what functions it should have," and Safari uses AI to write a plugin that meets those needs.

Image source: Apple

Yes, Safari has also joined the "Vibe Coding" trend.

The new "Shortcuts" also supports "Vibe Coding." In the past, creating a shortcut meant picking trigger components and execution components from an "endless" list and then writing out the complete logic with the rigor of writing code. Even Xiaolei has to admit that this complex, high-barrier process discouraged many users and left this useful app "wasted."

With Apple Intelligence integrated into Shortcuts, however, we only need to state what we want in natural language. For example: "When I have an out-of-town event starting in 5 minutes and there is no flight or train ticket information in the itinerary, automatically turn on the car air conditioner." Shortcuts will search the actions that various apps (including third-party apps) expose to it and automatically generate a shortcut that we can use right away.
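To make the example above concrete, here is a rough Python sketch of the logic such a generated shortcut would have to encode: a time trigger, an "out of town" condition, and a "no ticket" condition. This is purely illustrative and is not Apple's implementation or the Shortcuts format; `CalendarEvent`, `TICKET_KEYWORDS`, and `should_start_car_ac` are hypothetical names, and the keyword matching is an assumed stand-in for whatever the AI actually infers from the itinerary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CalendarEvent:
    # Hypothetical event record; not an Apple data type.
    title: str
    start: datetime
    city: str
    notes: str = ""


# Assumed markers that the trip is by plane or train rather than by car.
TICKET_KEYWORDS = ("flight", "train")


def should_start_car_ac(event, now, home_city, lead=timedelta(minutes=5)):
    """Return True when the spoken rule from the example would fire."""
    # Trigger: the event starts within the next `lead` minutes.
    starts_soon = timedelta(0) <= event.start - now <= lead
    # Condition 1: the event is not in the user's home city.
    out_of_town = event.city.lower() != home_city.lower()
    # Condition 2: the itinerary mentions no flight or train ticket.
    has_ticket = any(word in event.notes.lower() for word in TICKET_KEYWORDS)
    return starts_soon and out_of_town and not has_ticket
```

Written out this way, even a one-sentence request hides three separate checks, which is roughly the "rigor of writing code" the old Shortcuts workflow demanded from users by hand.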

Image source: Apple

The "Apple Home" smart home system, which is not widely used in China, has also been updated: For users who have installed HomeKit cameras and enabled related services, the Home app can analyze the monitoring footage in the background and summarize it into text; it can also directly find the corresponding video clips according to user inquiries.

Image source: Apple

The Photo Booth feature, previously dismissed as "useless," has also been comprehensively upgraded. It can now generate images from natural language (the old version required users to pick a style from fixed options) and can adjust the style and content of the image to the user's requirements.

Apple Intelligence's powerful image capabilities also bring stronger AI image editing. Beyond the usual object removal and AI image expansion, Apple has added a "spatial composition" ability to the Photos app: it first converts photos into spatial photos with depth information and then reframes them, adding a "Z" axis to traditional two-dimensional reframing (cropping).

Image source: Apple

Judging from the WWDC keynote, this function resembles the existing "spatial wallpaper" conversion on iPhones, but with a significantly higher level of refinement. Apple also mentioned that the feature first uses the on-device model for low-latency real-time operations and then calls Private Cloud Compute for full rendering.

"Replacing the heart" is successful, and Apple