When everyone in the office is clacking away at their keyboards... typing is turning into an ancient skill.
The keyboard seems to be becoming an antique.
In early February 2025, Andrej Karpathy first proposed and named the concept of "Vibe Coding": developers no longer write code line by line but describe their requirements to AI in natural language and let tools like Claude Code and Codex do the work.
This concept quickly became a popular term in the AI circle and evolved into a more widespread work style by the end of 2025: all knowledge work began to "go with the flow," letting AI turn ideas into outputs. People gave this work style a new name: Vibe Working.
Naturally, Vibe Working requires a smoother input method than the keyboard. Thus, voice input made its debut.
Voice dictation is merging with Vibe Coding: developers dictate requirements while pacing, and voice tools convert the voice into text prompts, which programming AI then turns into code. The speed of thought flow is no longer limited by the speed of fingers.
This integration even led to an unexpected embarrassment: the Mac Mini doesn't have a built-in microphone.
On Chinese platforms like V2EX, Zhihu, and Xiaohongshu, "What to do if you buy a Mac Mini for Vibe Coding but find it has no microphone" became a frequently asked question. Some people rummaged through the settings in confusion, unable to find an input device, before realizing the "flaw" of this machine: the Mac Mini (along with the Mac Pro and Mac Studio) has never had a built-in microphone.
Image source: Xiaohongshu @Keerbai
As a result, users who want to do Vibe Working have to order a USB or gooseneck microphone along with their Mac Mini. Apple probably didn't anticipate that "no microphone" would become a major drawback of a high-performance desktop computer, rather than just a minor annoyance when a user forgets to bring headphones.
Behind this embarrassment lies a real trend. Voice input is moving from the periphery to the mainstream at a speed beyond everyone's expectations.
Whispers in Silicon Valley offices
According to The Wall Street Journal, Mollie Amkraut Mueller, an AI entrepreneur in Seattle, used to have a sacred evening ritual: put the kids to bed, collapse on the sofa, and she and her husband would each turn on their computers to finish the remaining work of the day in the tranquility of the living room.
This peace was later broken.
It wasn't because of the kids crying but because of Mollie herself: she started whispering to her laptop at night, pausing occasionally, muttering corrections to herself, and then continuing. Her husband endured it for a while and then protested.
Amkraut Mueller is obsessed with a voice dictation app called Wispr Flow. Used together with Claude Code and Codex, it can turn rambling, stream-of-consciousness speech into coherent, usable text within seconds. It's efficient, yes. But it's also a bit strange.
This strangeness is spreading like a virus in Silicon Valley offices.
In some companies, this trend started with one employee and then quietly spread. Gooseneck microphones began to appear at workstations. More and more people are giving up the keyboard and whispering instructions to their computers instead.
One venture capitalist said that visiting an AI startup today is like walking into a high-end call center, except that everyone is chatting with AI. Engineers at the fintech company Ramp wear gaming headsets and talk loudly to their AI assistants; Edward Kim, the co-founder of the human resources company Gusto, encourages employees to try voice dictation technology and predicts that "future offices will sound more like a sales floor."
Then he led by example: "I've been talking to my computer all the time. I don't type unless I have to."
The Wall Street Journal report titled "Typing Is Being Replaced by Whispering — and It's Way More Annoying" quickly sparked widespread discussion. The author, Kate Clark, wrote: "The work style across Silicon Valley is being reshaped, and once-quiet office spaces are turning into noisy sound nests."
Image source: The Wall Street Journal
The Guardian also followed up the same month, publishing "The End of Typing? Why Workers Are Suddenly Ditching Their Keyboards."
For a while, "voice input" became one of the hottest topics in the tech circle.
How did this whispering revolution happen?
The sound evolution in Silicon Valley offices
Let's start with a bit of soundscape archaeology.
In 1998, the main sounds in the office were the dial-up beeps of fax machines, accompanied by the blinking red lights of answering machines. In 2008, it was the clattering of keyboards and the ringing of telephones. By 2018, the notification sounds of Slack took over.
In 2026, when you walk into a Silicon Valley AI startup, you'll hear a chorus of whispers — someone saying "Send an email to Zhang San about tomorrow's meeting," someone else saying "No, cancel, start over," and someone describing the logic of a function to the screen.
Chad Strickland of the NICH studio recorded this change on Substack: "The sound in our studio has changed in the past year. We've always been known for playing carefully selected playlists, with music playing from the moment we enter until the last person leaves. But the whispering started. Now we're very careful about our music selection; it can't have lyrics, so Jackie Gleason's classical jazz has become popular. Why? Because what you mainly hear now are people having one-sided conversations with their laptops. Pauses, half-spoken words, and occasionally a 'No, scratch that'."
Technologically, the key turning point came in 2022, when OpenAI released the open-source speech recognition model Whisper and pushed speech-to-text accuracy to a new level. Iteration has accelerated ever since. Whisper Large v3 has a word error rate of about 2.7% on clean-audio benchmarks, and OpenAI's gpt-4o-transcribe model, launched in 2025, reached an error rate as low as 2.5% in third-party evaluations. Compared with the error-prone speech recognition tools of five years ago, this is a qualitative leap.
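For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below is only an illustration of that definition; the function name and the sample sentences are made up for this example, and real benchmarks also normalize punctuation and casing in ways not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of six.
wer = word_error_rate(
    "send an email about tomorrow's meeting",
    "send an email about tomorrow meeting",
)
print(f"{wer:.1%}")
```

By this measure, an error rate of about 2.7% corresponds to roughly one wrong word in every 37.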
But the maturity of technology is just one piece of the puzzle. What really ignited this trend is a word: voicepilled.
Last fall, Reid Hoffman, the co-founder of LinkedIn, confessed on LinkedIn: "I am voicepilled." He argued that replacing typing with speaking is the next great leap in computing. Being "voicepilled" is an epiphany: once you're no longer restricted by that Victorian-era typewriter legacy (i.e., the keyboard), you can be more productive and more creative.
Image source: LinkedIn: Reid Hoffman
The word comes from the famous "red pill, blue pill" metaphor in the movie The Matrix: once you take the red pill and see another world, you can never go back. "Voicepilling" quickly became shorthand for ditching the keyboard, and it spread rapidly because AI voice dictation tools are now accurate enough to make speaking faster than typing.
An exploding market
The entrepreneurial journey of Wispr Flow is quite dramatic.
The company was founded in 2021 by Tanay Kothari and Sahaj Garg. Their initial goal was not voice input software but a non-invasive wearable device with a neural interface, one that would control computers and smartphones by reading users' neural signals. The team even built a prototype of a Bluetooth headset that Kothari described as "like pure magic." However, market demand did not materialize, and the company had to make a difficult pivot, cutting the team from about 40 people to 4 and shifting its focus to voice dictation. That pivot eventually gave birth to the widely noticed Wispr Flow.
This "forced transformation" hit a once-in-a-generation opportunity.
The Mac app of Wispr Flow was launched in the fall of 2024 and has been on a roll ever since: the monthly active users have increased by 50%. Kothari said that almost every top venture capital fund in Silicon Valley is using Wispr Flow to write emails, memos, and documents. VCs themselves have become the most enthusiastic users of this product, and "what VCs use" has never been a trivial matter in Silicon Valley.
Financing followed: in June 2025, Wispr Flow closed a $30 million Series A round led by Menlo Ventures; in November of the same year, it raised an additional $25 million in a round led by Notable Capital, at a valuation of about $700 million. According to a Bloomberg report in May this year, Wispr AI is in talks for a new round of about $260 million, and the valuation is expected to exceed $2 billion.
It took less than three years to go from a four-person team to a $2 billion valuation.
Wispr Flow is not alone. Early entrants Aqua Voice and Willow are both backed by Y Combinator, and a number of competitors such as TalkTastic, Typeless, and Superwhisper have since flooded in. TechCrunch named 2025 the year when AI voice dictation apps really took off and compiled a list of the best voice dictation tools of the year.
Wispr claims that after three months of use, the average user enters more than half of their characters by voice. The company reports a 12-month user retention rate of 70%, a 100-fold year-on-year increase in its user base, and more than 2.5 million downloads worldwide, and says the product is in use at 270 of the Fortune 500 companies.
One detail is worth mentioning: among Wispr Flow users, English accounts for only about 40% of input, and the remaining 60% comes from other languages, including Spanish, French, German, Hindi, and Mandarin. For a voice product developed by a "Silicon Valley startup," more than half of real-world usage actually occurs outside Silicon Valley. This may be the most underestimated aspect of the entire voice input trend.
The list of celebrity endorsements for this product is also quite impressive. Reid Hoffman publicly announced that he is "voicepilled"; Marc Andreessen, the founding partner of a16z, called it "staggeringly good"; Steve Wozniak, the co-founder of Apple, is also a regular user. Rahul Vohra, the CEO of Superhuman, called it "one of the most important consumer AI products since ChatGPT." In Silicon Valley, "what's in VCs' phones" is never just a personal choice; it's the prelude to the next round of financing conversations.
Tech giants have also sensed the trend. In May 2026, at the Android Show: I/O Edition 2026, Google launched Rambler, an AI voice dictation feature powered by Gemini and built into Gboard. It can automatically remove filler words, understand users' mid-sentence corrections, and handle mixed-language input. It is regarded as an important step in Google's official entry into the AI voice dictation arena. Ben Greenwood, the director of Android core experiences at Google, described it as "reinventing the keyboard."
For startups, this news is a mixed blessing: the entry of a giant is the best proof that the market is validated, but it is also the biggest competitive threat.
Meanwhile, even more "peculiar" usage scenarios are emerging. Allan Guo, the founder of Willow, announced on LinkedIn: "I'm happy to announce that we've removed the keyboard from the world's most prestigious TV awards." — The preparation team for the 2026 Emmy Awards is using Willow's voice dictation tool to handle Slack messages and clear their inboxes.
Image source: LinkedIn: Lawrence Liu & Allan Guo
When gooseneck microphones start to appear on the workstations of high-performance Macs and the Emmy Awards' operations team starts whispering to the screen, this change is no longer just a geeky trick in Silicon Valley.
According to a report by Mordor Intelligence in January this year, the global speech recognition market size is estimated to reach about $22.5 billion in 2026 and is expected to grow to $61.7 billion by 2031, with a compound annual growth rate of about 22.4%. This is just for the "speech recognition" segment, and the market space for the entire voice AI field is much larger.
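As a quick sanity check, the growth rate follows from the two market-size figures. The short sketch below recomputes the compound annual growth rate from the numbers quoted in the report, assuming five compounding years between 2026 and 2031; the function name is just for illustration.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """CAGR = (end / start) ** (1 / years) - 1."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Figures quoted above, in billions of US dollars.
rate = compound_annual_growth_rate(start=22.5, end=61.7, years=5)
print(f"Implied CAGR: {rate:.1%}")  # about 22.4%, in line with the report
```

The implied rate lines up with the reported figure, so the three numbers are at least internally consistent.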
When even Google starts to build voice dictation into the default keyboard, the direction of this trend is clear.
The situation in the Chinese market heated up earlier than outsiders expected.
Chinese users' habit of voice input actually predates the "voicepilling trend" in Silicon Valley.
This is closely tied to how the Chinese input method ecosystem developed. Unlike many Western users, who have long relied on keyboard input, Chinese users began using voice-to-text through mobile input methods very early. Some researchers believe there is a natural synergy between Chinese speech recognition and the pinyin input system, which makes voice input easier to integrate into everyday communication. At the same time, the input habits formed in the mobile Internet era have provided fertile ground for the spread of voice interaction.
Throughout this process, third-party input method vendors have kept upgrading their voice input capabilities.
Leading products such as Sogou, iFlytek, and Baidu have long dominated the market. Among them, iFlytek Input Method has always regarded AI voice technology as its core competitiveness. According to official iFlytek data, its voice input currently supports more than 200 dialects and more than 30 foreign languages and provides offline speech recognition capabilities.
In the past six months, the Chinese voice input market has entered an obvious product upgrade