
RAG Sentenced to Death: Google Undermines Engineers with One Line of API

New Intelligence Yuan (新智元), 2025-11-26 14:54
The Migration of Power: From Engineers to Platforms

Google declares RAG dead! The technology chain that countless engineers once took pride in now boils down to a single API call. Gemini's File Search packs chunking, indexing, retrieval, and citation into the Gemini API. Developers no longer need to understand the process; they just upload files. As intelligence is swallowed by automation, engineers realize for the first time that they have become part of the automated process.

For the past few years, RAG has been a source of confidence for engineers.

They manually chunked files, generated vectors, built indexes, and then carefully assembled the retrieved content into prompts.

It was a whole set of delicate and cumbersome engineering work. Only those who truly understood these processes dared to claim they "knew how to use large models."

Now, this pride has been erased by a single API.

With Gemini's File Search, uploading a PDF or JSON file is enough: the system automatically handles chunking, retrieval, and citation, and attaches the sources to its answer.

File Search abstracts the entire retrieval process.

This statement is like a sharp knife, cutting off the last link between humans and the system.

AI no longer needs engineers to teach it how to search for information, and engineers are starting to be optimized out by their own inventions.

From Process to Function: Google Cuts the RAG Engineering Chain with One Stroke

With the release of Gemini's File Search, RAG has gone from an engineering system to a built-in API capability.

Once a file is uploaded, the system automatically handles chunking, embedding (vectorization), indexing, retrieval, and citation, all behind the same interface. There is no need to build a vector database or maintain retrieval logic.

Multi-format support is built in as well: PDF, DOCX, TXT, JSON, and common code files can be parsed and embedded directly.

Developers can therefore build a unified knowledge base quickly, without extra adaptation for file types or structures.

In the update description, it is defined as:

A fully managed RAG system directly embedded in the Gemini API, with the retrieval process completely abstracted.

Developers no longer need to design chunking strategies or index structures. The system will automatically complete all steps in the background.

The workflow of Gemini File Search: Upload file → Automatically generate embeddings → Call Gemini for retrieval and generate answers → Output results with references

The pricing is designed as a low barrier to entry: storage and query-time embedding generation are free, and only the initial indexing is charged, at $0.15 per million tokens, so the marginal cost of deployment and expansion approaches zero.
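As a rough illustration of that rate, the one-time indexing fee can be estimated with simple arithmetic. This is a minimal sketch under stated assumptions: the corpus size is hypothetical, the function name is made up for the example, and the estimate covers only the indexing charge quoted above, not the usual model token charges for generating answers.

```python
# Rate quoted in the article: USD 0.15 per one million tokens, charged once at indexing time.
INDEXING_USD_PER_MILLION_TOKENS = 0.15


def indexing_cost_usd(total_tokens: int) -> float:
    """Estimate the one-time indexing fee in USD for a corpus of `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * INDEXING_USD_PER_MILLION_TOKENS


# Hypothetical corpus: 2,000 documents averaging 5,000 tokens each.
corpus_tokens = 2_000 * 5_000
print(f"{corpus_tokens:,} tokens -> ${indexing_cost_usd(corpus_tokens):.2f}")
# prints: 10,000,000 tokens -> $1.50
```

At this rate, even a corpus of ten million tokens costs on the order of a dollar or two to index, which is the sense in which the entry cost is described as close to zero.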

This means that the cost of building knowledge retrieval has almost dropped to zero, and the technical threshold of RAG has been absorbed by the platform.

The Logic of File Search: Embedding RAG into the API

The core of File Search isn't about whether it can search, but about hiding the entire retrieval chain.

In the past, to make the model answer questions based on external materials, one had to build a RAG process:

First, cut the file into small chunks, then convert each chunk into vectors using an embedding model and store them in a vector database. When the user asks a question, retrieve the most relevant segments and insert the results into the prompt to generate an answer.

The Ask the Manual demonstration application is powered by the new file search tool in the Gemini API

The entire process requires maintaining the database, managing indexes, tuning parameters, and assembling prompts, with every step relying on engineers.
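The hand-built pipeline described above can be sketched in a few dozen lines. This is a toy illustration, not a production design: the bag-of-words `embed` function stands in for a real embedding model, a plain Python list stands in for the vector database, and the sample manual text and all function names are invented for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 12) -> list[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Stand-in for a vector database: a list of (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 2) -> list[str]:
    """Return the `top_k` chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Insert the retrieved passages into the prompt, numbered so they can be cited."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Invented sample document (three 12-word sentences, so each becomes one chunk).
manual = (
    "To descale the kettle fill it with water and vinegar then boil. "
    "The warranty covers manufacturing defects for two years from the purchase date. "
    "Store the kettle empty with the lid open to keep it dry."
)
chunks = chunk_text(manual, max_words=12)
index = build_index(chunks)
question = "How long does the warranty last?"
prompt = build_prompt(question, retrieve(index, question, top_k=1))
print(prompt)
```

Every function here is a decision the engineer owns: chunk size, similarity measure, how many passages to keep, and how they are worded into the prompt. Those are the decisions a managed service makes on the developer's behalf.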

Now, all of this has been integrated into the underlying layer of the Gemini API.

Upload a file, and the system automatically completes chunking, embedding, and indexing. When asking a question, just call the same generateContent interface. Gemini will perform semantic retrieval and context injection internally and automatically generate references in the answer.

It even uses the dedicated gemini-embedding-001 embedding model, so that documents and queries share one semantic space.

Upload a document about the Hyundai i10 and ask "What is the Hyundai i10?" Gemini will retrieve the relevant paragraphs, write a well-reasoned answer, and display the sources supporting it.

More importantly, File Search has rewritten the development logic.

Developers no longer need to deploy an additional database or maintain a retrieval pipeline; the entire process is completed in a single call.

This means that RAG has changed from an independent system to a parameter.

What used to take hundreds of lines of code now comes down to one line of configuration: a file search tool entry in the generateContent call.
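The snippet below is not Google's official example. It is a minimal sketch of what a request body with file search enabled might look like, built as a plain dictionary and never sent over the network. The field names `file_search` and `file_search_store_names` are assumptions modeled on the public Gemini REST conventions, and the store name and helper function are hypothetical; check the current API reference before relying on any of them.

```python
import json


def file_search_request(question: str, store_names: list[str]) -> dict:
    """Build a generateContent-style request body with a file search tool attached.

    Assumption: the tool is declared as a `file_search` entry under `tools`,
    pointing at one or more previously created file search stores.
    """
    return {
        "contents": [{"parts": [{"text": question}]}],
        # The "one line of configuration": retrieval is switched on by this entry.
        "tools": [{"file_search": {"file_search_store_names": store_names}}],
    }


body = file_search_request(
    "What is the Hyundai i10?",
    ["fileSearchStores/hypothetical-manuals-store"],  # hypothetical store name
)
print(json.dumps(body, indent=2))
```

Compared with a hand-built pipeline, nothing in this request mentions chunk sizes, similarity measures, or prompt assembly; the only retrieval-related input the developer supplies is which store to search.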

When all retrieval, storage, injection, and referencing are automatically completed, engineers no longer need to understand how the system finds the answer.

File Search has transformed RAG from knowledge that needs to be mastered into a function that can be called.

At that moment, technology is no longer a skill but an option.

Engineers Losing Their Jobs

The launch of File Search isn't just a tool upgrade; it's a role shift.

It gives the system the ability to construct itself: it can chunk, index, retrieve, and cite automatically.

In the past, understanding this logic was where engineers' value lay; now, that understanding is completely hidden.

Among early adopters, the change is most visible at Beam (Phaser Studio):

They integrated File Search into their content production line to retrieve templates, components, and design documents. With thousands of queries per day across six corpora, the results are merged within two seconds.
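The article does not describe how queries across the six corpora are run and merged. Purely as an illustration of the general fan-out-and-merge pattern, the sketch below queries several corpora concurrently and keeps the best-scoring hits. The corpus names, scores, and the `search_corpus` stub are all hypothetical stand-ins for a real retrieval call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory corpora: each maps to (score, snippet) pairs a search might return.
CORPORA = {
    "templates": [(0.91, "platformer starter template"), (0.40, "menu template")],
    "components": [(0.78, "sprite physics component")],
    "design_docs": [(0.85, "collision design notes")],
}


def search_corpus(name: str, query: str) -> list[tuple[float, str, str]]:
    """Stub for a per-corpus search; a real version would call a retrieval service."""
    return [(score, name, text) for score, text in CORPORA[name]]


def search_all(query: str, corpus_names: list[str], top_k: int = 3) -> list[tuple[float, str, str]]:
    """Query every corpus concurrently, then merge the hits and keep the `top_k` best."""
    with ThreadPoolExecutor(max_workers=len(corpus_names)) as pool:
        per_corpus = list(pool.map(lambda name: search_corpus(name, query), corpus_names))
    merged = [hit for hits in per_corpus for hit in hits]
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return merged[:top_k]


for score, corpus, text in search_all("platformer collision", list(CORPORA)):
    print(f"{score:.2f}  {corpus:<12} {text}")
```

Because the searches run in parallel, total latency is bounded by the slowest corpus rather than the sum of all of them, which is what makes a merged answer within a couple of seconds plausible.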

CTO Richard Davey said:

Work that used to take days can now be completed in minutes.

Of course, this is an improvement in productivity, but it also means that engineers have lost the right to explain the system.

When the retrieval strategy, referencing logic, and even the data structure are controlled by the platform, engineering is no longer about building a system but about calling a system.

From the outside, it just means writing a few hundred fewer lines of code; but from the inside, it's the moment when the knowledge density is absorbed by the platform.

When complexity is hidden, people become replaceable.

The Transfer of Power: From Engineers to the Platform

What File Search really reshapes is not the development experience but the power structure: the party that understands the system is no longer the engineer but the platform.

In the traditional RAG process, engineers have control over the system.

They can decide how to chunk, index, and retrieve, and can also explain why the model gives a certain answer.

This sense of control comes from visibility; they can see the logic of each step.

File Search puts the visible engineering steps into an invisible API.

The retrieval strategy, index structure, and referencing rules are hosted in the cloud. Developers can only see the answers, not the process.

This means that the power of knowledge injection is being concentrated. The platform decides which passages the model bases its answers on, which evidence is ignored, and how the retrieval results are weighted.

Engineers no longer "build the system" but "call the system."

This isn't an isolated case. OpenAI's Custom GPTs, Anthropic's Console, and Gemini's File Search are all pushing complexity down into the platform's underlying layer, making development easier but also more tightly controlled.

Each abstraction is a concentration of power.

The birth of File Search has brought AI development into the zero-configuration era:

People no longer need to understand the model; they just call it. The platform no longer provides capabilities; it provides results directly.

This change arrives without dramatic conflict, yet it completely redraws the boundaries of technology.

When the system builds itself, personal understanding is replaced by trust in the platform.

File Search doesn't "kill" RAG; it just turns RAG into the blood of the system.

Complexity is hidden, and power is concentrated. What engineers need to do is find a new entry point at a higher level of abstraction.

References:  

https://blog.google/technology/developers/file-search-gemini-api/ 

https://x.com/frxiaobei/status/1990091775382602021?s=20 

https://medium.com/%40abdulkadir9929/gemini-apis-new-file-search-tool-built-in-rag-for-everyone-e990c054dcff 

This article is from the WeChat official account "New Intelligence Yuan". The author is New Intelligence Yuan. It is published by 36Kr with permission.