In-depth review of Google Gemma 4: the most powerful on-device model is not perfect, but it is well suited to mobile phones
Google recently released its new-generation open-source model Gemma 4 in four sizes: E2B, E4B, 26B, and 31B. The two "small models", E2B and E4B, can be deployed and run offline directly on edge devices such as smartphones and the Raspberry Pi.
As soon as the two small Gemma 4 models launched, many hailed them as the best on-device models to date. Lei Technology (ID: leitech) has already published two hands-on tests: one focused on logical reasoning and multimodal capabilities, the other on the experience on domestic thousand-yuan phones.
After using the models for some time, the Lei Technology editorial team has more impressions to share.
Image source: Shot by Lei Technology
The on-device model is 100 times more useful than an encyclopedia
Recently, Apple announced that John Ternus, the senior vice president in charge of hardware engineering, will succeed Tim Cook as the company's CEO. Numerous analyses at home and abroad then asked, "Why did Cook choose him as his successor?" So what kind of answer can Gemma 4 E4B give if we put this question to it?
After the question was entered in the chat box, Google's on-device model began outputting text almost immediately, with close to "zero latency". That responsiveness alone is striking. (Note: the test device is an iPhone 17 Pro Max, here and below.)
Image source: Lei Technology
However, because the answer was long, Google's on-device model took 46 seconds to complete it.
Image source: Lei Technology
At first glance, the answer would already satisfy quite a few people, and this is the core advantage of the on-device model:
At the lowest possible cost (local execution, zero token consumption), it can give a "reasonably good" answer or a "good enough" solution.
A domestic drama called "Peaceful Years" was popular this year and generated a lot of discussion. Some time ago, we also put a related question to Google's on-device model:
How could the Wuyue Kingdom maintain peace and prosperity for more than 80 years under the heavy - tax policy?
This is a fairly specialized and detailed question. Many university graduates who did not major in history may not be able to answer it clearly. Let's see how the E4B model does:
Image source: Lei Technology
As the answer shows, the on-device model is more than an offline encyclopedia: it can tailor its answers to the user's specific question and angle, including specialized questions in various fields.
The knowledge cutoff of Google's Gemma 4 E4B model is October 2023. In theory, you can ask it about any recorded, public event, scientific discovery, historical information, or cultural knowledge from before that date.
Lei Technology believes this is a very useful application for the on-device model as a tool, especially for users who are curious about information and knowledge of all kinds, domestic and foreign, ancient and modern.
After a first trial of the app (Google AI Edge Gallery), our editor moved it to the Dock on the phone's home screen because it is used almost every day.
It is worth mentioning that, according to Google, although the core training data of Gemma 4 has a knowledge cutoff, the system will continue to be updated and fine-tuned to improve the model's understanding and answers.
The on-device model often stumbles on simple questions
We assumed the on-device AI model could fully handle basic knowledge, but reality delivered a heavy blow.
Gemma 4 E4B even got both the full text and the author of the famous Tang poem "Invitation to Wine" wrong.
Image source: Lei Technology
The reason is simple. The on-device model has relatively few parameters. Even a model as capable as Google's Gemma 4 cannot cover every field of knowledge, so details in many fields come out "distorted" or "hallucinated".
For classical poems, ancient texts, and similar source material, do not ask the on-device model to recall the original text. Instead, give it the original (for example, the poem or the classical Chinese passage) and ask it to translate or interpret it.
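The advice above amounts to placing the source text inside the prompt instead of relying on the model's memory. A minimal sketch of that idea follows; the helper name `build_grounded_prompt` and the marker wording are hypothetical and are not taken from Gemma or Google AI Edge Gallery documentation.

```python
def build_grounded_prompt(source_text: str, task: str) -> str:
    """Wrap user-supplied text so the model works from it, not from memory.

    Hypothetical helper for illustration; the markers are arbitrary.
    """
    source_text = source_text.strip()
    if not source_text:
        raise ValueError("source_text must not be empty")
    return (
        f"{task.strip()}\n"
        "Use only the text between the markers below. "
        "Do not quote lines that are not in it.\n"
        "<<<TEXT\n"
        f"{source_text}\n"
        "TEXT>>>"
    )


# The user pastes the original verse; the model is asked only to interpret it.
prompt = build_grounded_prompt(
    "君不见黄河之水天上来，奔流到海不复回。",
    "Translate this classical Chinese verse into plain English and explain it.",
)
print(prompt)
```

The resulting string can be pasted into the chat box as a single message. Whether a small model honors the "use only this text" constraint is something to verify case by case.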
Because the small parameter count leaves the model with a limited knowledge base, Google has also introduced "agent" capabilities to the on-device model for the first time.
For information retrieval, however, it can currently connect only to online encyclopedia sites (such as Wikipedia); there are no downloadable offline knowledge bases to serve as "add-ons".
Image source: Lei Technology
Beyond general knowledge Q&A, on-device AI models such as Gemma 4 E2B/E4B are also pushing into work assistance and task execution.
At the tool level, we assumed that tasks such as checking an article for basic grammar errors could be handed over entirely to the on-device model, but its actual performance is not reassuring either, especially on long passages.
The reason is that high-precision tasks such as grammar checking require a large editorial corpus and a strong memory of language distribution. The on-device model often turns grammar checking into text polishing, or blurs the two, because suggesting rewrites is easier for it.
Notably, the on-device model may struggle to "fully understand" an instruction such as "check and correct basic grammar errors". Rephrase it as "check basic grammar errors (do not modify if there are no errors)" and its output becomes much clearer.
Image source: Lei Technology
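The narrower instruction described above can be captured in a reusable prompt builder. This is a rough sketch under stated assumptions: the function names are hypothetical, the exact wording is one possible phrasing, and splitting long input into paragraphs is an added suggestion prompted by the weaker results on long passages, not something the tests above confirmed.

```python
def build_grammar_check_prompt(text: str) -> str:
    """Build a narrowly scoped instruction: report errors, never polish."""
    return (
        "Check the text below for basic grammar errors only.\n"
        "If a sentence has no errors, do not modify it.\n"
        "Do not polish, rephrase, or improve the style.\n"
        "List each error as: original -> correction.\n"
        "If there are no errors at all, reply exactly: NO ERRORS\n\n"
        f"Text:\n{text.strip()}"
    )


def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines so each request covers one short paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


document = "He go to school yesterday.\n\nThe results was clear."
for paragraph in split_paragraphs(document):
    print(build_grammar_check_prompt(paragraph))
    print("-" * 20)
```

Each printed prompt is sent as its own message, which keeps the task boundary (check only, do not rewrite) in front of the model for every paragraph.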
Google's Gemma 4 offers control features such as a system role and function calling, but they work well only if you keep the prompt template, task boundaries, output format, and so on as simple and clear as possible.
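To make "simple and clear" concrete, here is a minimal sketch of a function-calling loop on the application side. The JSON shape (`name` plus `arguments`), the tool names, and the system prompt are assumptions chosen for illustration; they are not Gemma's documented function-calling format, and the model reply is simulated with a fixed string.

```python
import json

# Hypothetical tool registry; names and signatures are illustrative only.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

# One role, one task boundary, one output format.
SYSTEM_PROMPT = (
    "You are a calculator assistant. Handle arithmetic only.\n"
    "Reply with exactly one JSON object and no other text, e.g. "
    '{"name": "add", "arguments": {"a": 1, "b": 2}}'
)


def dispatch(model_output: str):
    """Parse a JSON tool call from the model and run the matching function."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("expected a JSON object")
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    arguments = call.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be a JSON object")
    return TOOLS[name](**arguments)


# Simulated model reply; a real run would take this string from the model.
result = dispatch('{"name": "multiply", "arguments": {"a": 6, "b": 7}}')
print(result)
```

Rejecting anything that is not a single well-formed object keeps failures visible, which matters with small models that sometimes wrap the JSON in extra prose.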
In addition, our tests show that although Gemma 4 natively supports more than 140 languages, it handles English better than Chinese in complex, delicate tasks such as checking grammar in long texts. This may be because its pre-training corpus is still mainly English.
Is the on-device model better suited to dedicated scenarios?
Beyond the cases above, Lei Technology has previously tested the native multimodal (image, audio, and video) capabilities of the Gemma 4 E4B model. It can recognize objects in pictures directly and understand simple audio and video.
Offline or on a poor network, you can send a picture from the photo album and Google's on-device model will give basic information about it.
On a flight, for example, if you want a "simple" explanation of a picture in an in-flight magazine or newspaper, you can send it to the on-device model and let it try to answer.
With more complex images and audio, the current on-device model still struggles to extract "more" information.
Image source: Lei Technology
So what is the on-device model best at right now?
Without doubt: offline translation, calculation, simple problem solving and test practice, and basic explanation and consultation in fairly specialized fields (including health).
Google previously built a dedicated translation model, TranslateGemma, on Gemma 3. Thanks to its specialized training process, the TranslateGemma 4B model performs comparably to the larger Gemma 3 12B baseline model. Google can be expected to launch a new-generation dedicated translation model based on Gemma 4 soon.
Comparison of translation results between Google's on-device model and online translation tools (Image source: Lei Technology)
Coincidentally, Tencent Hunyuan also recently open-sourced the mobile offline translation model Hy-MT1.5-1.8B-1.25bit, compressing a large translation model that supports 33 languages to 440MB. Users can download it for free and run it directly on their phones without an Internet connection. According to Tencent, its translation quality "rivals" that of commercial translation models.
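A quick back-of-the-envelope calculation shows why such a model fits on a phone. The sketch below assumes that the "1.8B" and "1.25bit" in the model name mean roughly 1.8 billion parameters stored at an average of 1.25 bits each; that reading is an inference from the name, not a published specification.

```python
def packed_weight_megabytes(num_params: float, bits_per_weight: float) -> float:
    """Size of the weights alone, in decimal megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


# Assumed figures read off the model name, not an official breakdown.
low_bit = packed_weight_megabytes(1.8e9, 1.25)
half_precision = packed_weight_megabytes(1.8e9, 16)

print(f"1.25-bit weights: {low_bit:.2f} MB")       # 281.25 MB
print(f"16-bit weights:   {half_precision:.0f} MB")  # 3600 MB
```

Under these assumptions the weights alone come to about 281MB, against about 3.6GB at 16 bits. The reported 440MB download is larger than the weight-only estimate; the source does not break that figure down, so the difference is left unexplained here.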
Gemma 4: the "imperfect" first step for on-device models
In recent months, cloud-based large models from various companies have iterated rapidly, and the race in parameter count and intelligence has entered a new stage. Meanwhile, the on-device model, hardly a new concept, is also working to deliver real results as soon as possible.
After some time with it, Lei Technology's strongest impression is that the launch of Google's Gemma 4 marks an "imperfect" first step in deploying on-device models on mobile devices.
At its current level of ability, the on-device model is recommended for two main groups of users:
1. "Encyclopedia-oriented" users who look up a large amount of information, domestic and foreign, ancient and modern, every day. In some fields, the current on-device model can give you a "first draft" answer more quickly, directly, and specifically.
2. "Tool-oriented" users who have installed a large number of offline apps on their phones. The current on-device model performs well in tool-type tasks such as translation, calculation, simple problem solving and test practice, and basic explanation and consultation in fairly specialized fields.
Of course, if you simply want to try something new or watch on-device models mature, you can also download it and try it.
For iPhone users, even if Apple launches its own on-device model in the future, it will probably only reach the level that Google's Gemma on-device models can reach later. The "incremental" or "enhanced" abilities worth expecting are mainly the "perfect linkage" and "seamless access" of the on-device model to the various operations on the phone.
Image source: Google
It should be noted that the response speed of Google's Gemma 4 on-device model depends closely on your phone's RAM and computing power.
For iPhone users, at least 8GB of RAM is advisable and 12GB is recommended; for Android users, at least 12GB of RAM is advisable and 16GB is recommended. With such a