The success of NotebookLM is something that Google itself also wants to replicate. | Focus Analysis
Written by | Yuan Yingliang
Edited by | Deng Yongyi
In the United States, to a certain extent, the influence of podcasts has already exceeded that of TikTok.
In the recently concluded US presidential election, both Trump and Harris treated podcasts as an important campaign battleground. Trump's interview with Joe Rogan, the top podcast host in the US, drew a phenomenal 48 million views on YouTube and was regarded as the "last push" toward his victory.
Times have changed. By comparison, according to Nielsen, on Tuesday, November 5, during election-night prime time, only about 42.3 million viewers watched coverage of the 2024 US presidential election across 18 television networks.
From the White House to Silicon Valley, podcasts are rising rapidly as a new medium.
In the past two months, NotebookLM, an AI note-taking product developed by Google, has grown rapidly. Similarweb data shows that traffic in October rose by roughly 200% to 9.2 million visits, about three times the 3.1 million visits of a month earlier. It has been shared widely on social media and has become extremely popular.
The reason for its popularity is a small NotebookLM feature, "Audio Overview", which turns long text into an AI-generated podcast. From any text material it can produce a realistic two-host podcast more than ten minutes long. The voices, the intonation, and even the banter are full of a "human touch".
△ The "Audio Overview" feature of NotebookLM
Netizens exclaimed: "Google has finally made a significant innovation this time." Once it took off, Google's old partner Meta couldn't sit still and promptly launched NotebookLlama, an open-source alternative to NotebookLM based on the Llama model.
Interest in AI note-taking products has risen accordingly. On November 15, Tencent officially announced a new AI note-taking product, ima.copilot, which can retrieve WeChat Official Account articles and build them into a personal knowledge base.
According to TechCrunch, on October 28, the AI note-taking product Read AI raised $50 million in Series B financing. On October 23, Granola completed its $20 million Series A financing.
Accidental Success
In fact, NotebookLM is just a demo with potential for practical application. Google did not originally intend to create a podcast product. It mainly wanted to summarize, organize, and answer questions about source documents in order to improve productivity, which is what most AI note-taking products aim to do.
However, the AI-generated podcast feature turned out to be its "golden finger", the cheat code that made it take off.
△ The traffic curve of NotebookLM in recent months
This is similar to what happened with ChatGPT, which was originally just a preview of a large model. Its chat window was meant to let the public feel the product's improvement intuitively.
Making a podcast with NotebookLM is very simple. Upload one or more source materials, click to load the audio on the right, and after a few minutes a conversation between a male and a female host, more than ten minutes long, is produced. The audio can also be customized, for example by specifying the target audience and the topics to focus on.
Feed Altman's latest interview to NotebookLM, and the serious interview content becomes lively.
The two hosts analyze the potential of AI while using humorous metaphors to make complex technical topics accessible and down-to-earth. For example, they compare the birth of AGI to "the arrival of a new life", or joke that an AI company needs "a smart person and 10,000 GPUs".
The Scaling Law also becomes easy to understand in their conversation:
A: Imagine you are teaching a computer to recognize cats, right? A small model might learn pointed ears, a round face, the basics, but a huge model is trained on millions of cat images.
B: It becomes the ultimate cat expert.
A: Far more than that. It can also learn breeds, read emotions from expressions, and even detect tiny signs of disease. It works at a whole other level of detail.
The surprisingly realistic results and the simple one-click experience have made many netizens eager to try it. A quick search on X turns up many posts about "making a podcast in a few minutes", all of them made with NotebookLM.
Adding other AI tools, such as HeyGen for generating digital humans or Wondercraft for editing scripts and sound, makes even more varied audio and video content possible.
△ Source: X
To Go Viral, Marketing Comes First
Nowadays, it is hard for a generative AI product to become popular without some "tricks".
Similarly, the previous AI product to draw heavy social media attention was a small roast feature from the low-code development platform Wordware. Within 8 days of launch it attracted 4.26 million users, which surprised even the founder.
This sharp-tongued AI behaves like a native netizen. Based on content posted on X, it can analyze the personality of a single account or the compatibility of two accounts. It is cutting and fluent in internet memes, which netizens find hugely satisfying.
For example, its evaluation of the relationship between Musk and Trump is, "This is a high-risk power duo with explosive potential for innovation and controversy... They are like two alpha wolves competing to see who can howl at the moon first... Musk proposes to bomb Mars with nuclear weapons, while Trump wants to make the moon great again."
△ Wordware's sharp-tongued analysis of the relationship between Musk and Trump
Comparing how NotebookLM and Wordware spread, we find that both released, alongside their core products, small features that trigger widespread sharing on social media.
These small tools greatly lower the barrier to use: even beginners can get started quickly, and the output is funny and interesting. This matches the logic behind the global popularity of short videos on Douyin and TikTok.
Raiza Martin, the head of the NotebookLM team, also described in an interview an unconventional way of operating that resembles "open entrepreneurship".
The team shares project progress on social media every day and has set up a channel on Discord, where developers gather, so that it can follow user feedback and usage habits first-hand and make timely adjustments and updates. More than 60,000 users have joined so far.
The strategy of the video generation startup Pika is actually similar. When it first appeared, the team chose to operate on Discord and quickly gained 500,000 users.
In October of this year, the newly released Pika 1.5 model introduced new AI templates: given a static image, it can apply effects such as exploding, melting, inflating, or turning the subject into a cake. Creative, funny, and fluent in internet memes, these effects precisely target the interests of social media users.
Generative AI Applications Are Experiencing Innovation in Interaction Methods
Andrej Karpathy, a co-founder of OpenAI, explained why the two-person podcast format is attractive: chatting is hard, but listening to others chat is much easier; reading is hard, but listening while leaning back in a chair is much easier.
Better still, don't ask users for input at all, because people often don't know what they want until it is shown to them directly.
Looking back, the Internet's super applications were all, in essence, born from innovations in the form of interaction.
Before Douyin, most videos were horizontal. To consume content, users had to click, exit, select, and click again, over and over. Douyin's vertical video feed reduced all of this to swiping up and down, greatly lowering the barrier to use.
Today's AI products are going through a similar process. As models move from text to multi-modal, users no longer need to type to interact with them. Simply speaking to the model already works smoothly.
The lesson of NotebookLM is to turn the capabilities of large language models (intelligence, context length, multi-modality, and so on) into forms of content that users can consume more easily. The focus is not on the AI itself, but on use-case positioning and user experience.
ChatGPT canvas, released in October this year, is OpenAI's attempt at interaction design. Its most distinctive feature is that it combines asking ChatGPT questions with editing text or code in one place, creating an interface built for human-computer collaboration.
In other words, users can edit text or code directly in the canvas. The edited document is displayed automatically on the right side of the chat interface. Users can select the part that needs adjusting with the mouse and then ask ChatGPT about it in the pop-up box, or use the shortcut menu to adjust the length, change the reading level, fix errors, and so on.
△ The canvas page
Josh Miller, the founder of Arc Browser, believes that small companies still have opportunities, especially in user interface innovation. In other words, AI products that successfully define the user interface are more likely to become killer applications and ultimately win.
Even Google Wants to Replicate Itself
After NotebookLM took off, Google recently launched an AI learning tool called Learn About.
Like an interactive electronic encyclopedia, Learn About offers interactive articles and guides on subjects such as history, biology, astronomy, and sports, and can automatically expand on a topic and explore it in depth, using AI to speed up learning.
The interaction design of Learn About alone, with its key-point summaries, timelines, and frequently asked questions, carries a strong "NotebookLM flavor".
△ The page of Learn About
Unlike most AI chatbots, NotebookLM and Learn About keep the blank dialog box but give more of the screen to "boxes" filled with content.
△ The page of NotebookLM
These boxes directly display suggested topics, guides, key points, and annotations, so users do not have to think about "what should I ask".
With just a few clicks, knowledge floods onto the screen in different modalities such as text, images, and videos. Many netizens describe this as an "Alice falling into the rabbit hole" experience.
△ Learn About presents multi-modal extended learning by sliding the page
In addition, Raiza Martin, the head of the NotebookLM team, believes that what really needs to be considered in product design is how to make new things intuitive and easy to accept.
She gave a counterexample. In NotebookLM, the first step for users is to upload a source document. For users accustomed to ChatGPT, this small extra step is enough to cause confusion and hesitation.
Of course, a design built around source documents has its own significance. It reminds us that everyday creation is often based on existing materials or documents.
This is also why AI note-taking products can deliver such a boost in productivity. By integrating, analyzing, and expanding on information from many sources, they cut down on detours when searching for information and build a second brain.
From this perspective, we can also see echoes of how the Internet developed. Previously, the second brain was an app or a mini-program; now it has become AI.
Our needs have not changed, but we now expect the tools that meet them to be smarter.