After the coffee machine became smart, I couldn't even have a cup of coffee.
So far, no one has truly solved the problem of how to make LLMs know when to be precise and when to be random.
It was really a morning that made one feel utterly defeated.
A tech journalist from The Verge got up, walked into the kitchen, and said to the Bosch coffee maker that supports Alexa, "Make a cup of coffee."
No improvisation, no complex request; she just hoped the machine would execute a pre-set program. Instead, she was refused.
And not just once.
Since the upgrade to Alexa Plus (Amazon's generative AI voice assistant), such conversations have almost become a regular part of her morning routine.
Every time she asks it to make coffee, Alexa can come up with different reasons, using amazing creativity to tell her no.
It's almost the end of 2025. AI can write papers, write code, chat with people, and teach, but it fails at the simple request of "Make a cup of coffee" in the morning.
In community discussions, similar complaints are extremely common, and people are full of grievances.
Turning on the lights has become a problem area.
Playing a song is a struggle.
Even setting a timer has become difficult.
Some people have given up entirely.
Obviously, there is a sharp contrast between the reality and people's intuitive expectations of AI.
Traditional assistants may be dumb, but they are highly predictable: say the (slightly silly) "magic words" correctly and the result is always the same.
Now, the generative AI assistants centered around LLMs have higher intelligence, deeper understanding, and more diverse expressions. However, they often fail at the things they were originally best at:
Turning on the light, setting a timer, reporting the weather, playing music, and running routines.
Why is this the case?
Because LLMs inherently introduce a large amount of randomness. They can understand more meanings and allow more free-form expression, but the cost is a greatly expanded interpretation space, which includes the possibility of misunderstanding.
If you ask ChatGPT the same question today and tomorrow, you may get different answers. This is exactly where its value lies. However, when this characteristic is used to control a coffee maker, there is a problem.
Relying on probabilistic output in control scenarios that demand instant, repeatable, zero-tolerance execution is itself the bug.
In contrast, traditional voice assistants are essentially template matchers. They don't understand anything; they recognize keywords and then fill in parameters.
For example, if you say "Play the radio," the system clearly knows that what follows can only be the "name of the radio station."
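The template-matching approach can be sketched in a few lines. This is a minimal illustration, not any vendor's real implementation; the patterns and intent names are made up for the example:

```python
import re

# Each template pairs a rigid pattern with an intent name. Named groups
# are the "slots" the system fills in; anything off-template fails.
PATTERNS = [
    (re.compile(r"^play the radio(?: station)? (?P<station>.+)$"), "play_radio"),
    (re.compile(r"^set a timer for (?P<minutes>\d+) minutes?$"), "set_timer"),
    (re.compile(r"^make (?:a cup of )?coffee$"), "make_coffee"),
]

def match_intent(utterance: str):
    """Return (intent, slots) if a template matches, else (None, {})."""
    text = utterance.lower().strip()
    for pattern, intent in PATTERNS:
        m = pattern.match(text)
        if m:
            return intent, m.groupdict()
    return None, {}
```

The output is fully deterministic: "Set a timer for 5 minutes" always yields `("set_timer", {"minutes": "5"})`, and any phrasing outside the templates is simply rejected. That rigidity is exactly the "dumb but predictable" quality described above.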
To make up for the lack of certainty in generative models, both Amazon and Google have tried to deeply integrate LLMs with smart home APIs. However, this has introduced new problems.
LLMs are simply not good at generating perfectly consistent, syntactically valid system calls for every request.
When they are required to directly generate API calls to control real - world devices, even a tiny deviation can lead to the failure of the entire operation.
This is exactly why your coffee maker sometimes just refuses to make coffee for you.
Theoretically, it is not impossible to make the new assistants as reliable as the old ones, but this requires an extremely large amount of engineering investment, constraint design, and failure fallback mechanisms.
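One such fallback design, sketched here with stand-in functions rather than any assistant's real internals, is to try the generative parser first and fall back to a deterministic keyword match when it fails:

```python
# Fallback chain: generative parse first, deterministic match second.
# Both parsers below are illustrative stand-ins.

def llm_parse(utterance: str):
    """Stand-in for a generative parser; may return None or go off-script."""
    return None  # simulate an off-schema or refused response

def keyword_parse(utterance: str):
    """Deterministic last resort: exact phrases map to fixed actions."""
    table = {"make a cup of coffee": "coffee_maker.brew"}
    return table.get(utterance.lower().strip())

def handle(utterance: str) -> str:
    action = llm_parse(utterance) or keyword_parse(utterance)
    return action if action else "sorry, I didn't catch that"
```

With this layering, "Make a cup of coffee" still works even when the generative layer fails, which is precisely the kind of engineering investment the paragraph above describes.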
In the real world where resources are limited and the temptation of "doing something more exciting and profitable" is strong, the simplest approach is to first introduce the technology into the real world and then let it gradually correct itself.
In other words, we are all collectively playing the role of long - term beta testers for AI.
So far, no one has truly solved the problem of "how to make LLMs know when to be precise and when to be random." Therefore, people may have to struggle with it and their rising blood pressure for a long time.
So, why are we so determined to abandon the old technology?
In a word: potential.
So-called agentic AI gives the system the ability to chain service calls: it can understand how the parts of a complex task relate to each other and dynamically generate execution logic accordingly.
This is also the fundamental reason why the old technology route must be abandoned.
In the past, voice systems built on fixed rules and keyword matching were, at the architectural level, single-instruction executors. They could not understand goals or break tasks down, let alone generate new action paths at runtime.
This is not a simple technological upgrade but a switch in the paradigm of capabilities.
Back to the community discussions, although the upgraded voice assistants still make mistakes in the most basic commands, netizens also admit that they are indeed better at understanding complex commands.
For example, if you say, "Dim the light here and raise the temperature a bit," it can adjust both the light and the thermostat at the same time.
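What "one utterance, several device calls" looks like can be sketched as a toy planner that maps a compound request to an ordered list of actions. The plan format and device names below are assumptions for illustration, not how Alexa actually plans:

```python
# Toy planner: a compound request fans out into an ordered action list.
# A real agentic system would have a model produce this plan.

def plan_actions(utterance: str):
    """Map a compound request to an ordered list of device calls."""
    plan = []
    text = utterance.lower()
    if "dim" in text:
        plan.append({"name": "light.set_brightness", "args": {"level": 30}})
    if "temperature" in text:
        plan.append({"name": "thermostat.adjust", "args": {"delta": 1}})
    return plan
```

"Dim the light here and raise the temperature a bit" yields two calls in sequence, something a single-template matcher could never decompose.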
When you ask angrily, "Alexa, what on earth are you doing? Why don't you turn off my music?!" it will actually check what's going on.
In the past, these were all unimaginable.
What is most praised is the change in the camera notification function.
Traditional systems often only give a highly generalized and useless message, such as "Motion detected in the backyard." So you have to: open the app → click on the video → review it → and find out that it's just a cat.
Now, the new system will directly tell you, "An unfamiliar face appeared at the door but did not enter the yard."
Setting up complex routines with voice is indeed easier than clicking through multiple levels of settings in the Alexa app, even though these routines may not run very stably.
In a large number of user discussions, a relatively moderate consensus has gradually emerged: the problem is not about introducing AI, but about the "boundaries" and whether to try to use AI to replace everything.
Some users believe the more reasonable direction is not to "eliminate the buttons" (that is, to replace proven, deterministic execution mechanisms) but to let AI help people understand the system.
The current chaos may not be a failure of generative AI but rather a result of placing it in an inappropriate central position.
For now, though, that clear boundary has yet to be drawn, and it is unclear when it will be.
So, how is your smart home? Have you ever had similar frustrating moments? Welcome to share your thoughts in the comment section.
Reference Links
https://www.theverge.com/tech/845958/ai-smart-home-broken
https://www.reddit.com/r/technology/comments/1pvh1c8/how_ai_broke_the_smart_home_in_2025_the_arrival/
This article is from the WeChat official account "Almost Human" (ID: almosthuman2014). Author: Sia. Republished by 36Kr with permission.