"Context engineering" has become extremely popular in Silicon Valley, and Karpathy himself has lent his support, causing prompt engineering to fall out of favor overnight.
What's hot in Silicon Valley these days isn't prompt engineering anymore, but context engineering!
Even AI guru Karpathy has cast his vote for "context engineering."
Shopify CEO Tobias Lütke also said he prefers "context engineering" because it accurately describes a core skill:
The art of providing complete background information so that a large model can plausibly solve the task.
Overnight, "context engineering" has become a sensation across the internet. Why is that?
Context Engineering, an Overnight Sensation
The reason behind this is inseparable from the rise of AI agents.
OpenAI President Greg Brockman has publicly stated multiple times that "2025 is the year of AI agents."
The most crucial factor in an agent's success or failure is the quality of the context it is given. In other words, the information loaded into its limited "working memory" matters more than ever.
In most cases where AI agents fail, it's not the failure of the model, but the failure of the context!
So, what is context?
To understand "context engineering," we first need to expand the definition of "context."
It's not just a single prompt you send to the LLM. It can be regarded as "all the content the model sees before generating a response," as follows:
Instruction/System Prompt: The initial set of instructions that define the model's behavior in the dialogue, which can (and often should) include examples and rules.
User Prompt: The user's immediate task or question.
State/History (Short-term Memory): The current dialogue up to this moment, including the user's queries and the model's responses.
Long-term Memory: A persistent knowledge base collected across multiple previous dialogues, containing learned user preferences, summaries of past projects, or facts to be remembered for future use.
Retrieved Information (RAG): External, up-to-date knowledge, i.e. relevant information pulled from documents, databases, or APIs to answer specific questions.
Available Tools: The definitions of all functions or built-in tools that the model can call, such as check_inventory, send_email.
Structured Output: The definition of the model's response format, such as a JSON object.
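Taken together, these components can be pictured as a single request payload assembled before the model call. Here is a minimal sketch in Python; all concrete values (the tool names, the retrieved snippet, the dialogue turns) are invented for illustration:

```python
# Sketch: assembling every context component into one LLM request.
# All concrete data here (tools, history, retrieved text) is hypothetical.

system_prompt = (
    "You are a helpful shop assistant. "
    'Answer using the retrieved facts. Reply as JSON: {"answer": ...}'
)

# State/History (short-term memory): the dialogue so far.
history = [
    {"role": "user", "content": "Do you have the blue mug in stock?"},
    {"role": "assistant", "content": "Let me check that for you."},
]

# Retrieved information (RAG): facts fetched for this specific question.
retrieved = "Inventory DB: blue mug, SKU 1042, 7 units in stock."

# Available tools the model may call.
tools = [
    {"name": "check_inventory", "description": "Look up stock for a SKU"},
    {"name": "send_email", "description": "Send an email to a customer"},
]

# The final context: everything the model sees before generating.
request = {
    "messages": [
        {"role": "system", "content": system_prompt},
        *history,
        {"role": "user",
         "content": f"Context:\n{retrieved}\n\nIs the blue mug available?"},
    ],
    "tools": tools,
}

print(len(request["messages"]))  # system + 2 history turns + 1 new user turn
```

The point of the sketch is that the "prompt" the model actually sees is this whole structure, not just the final user sentence.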
Clearly, unlike "prompt engineering," which focuses on carefully crafting the perfect instruction in a single text string, "context engineering" is far broader in scope.
To put it simply:
"Context engineering" is a discipline that is committed to designing and building dynamic systems.
These systems can provide the right information and tools in the right format at the right time, so that the LLM has everything it needs to complete the task.
Here are all the characteristics of "context engineering":
· It's a system, not a string: Context is not a static prompt template, but the output of a system that runs before the main call to the LLM.
· It's dynamic: Context is generated on the fly, tailored to the current task. For example, one request may require calendar data, while another may require email content or web search results.
· It emphasizes providing the right information and tools at the right time: Its core task is to ensure that the model doesn't miss key details (keep in mind the "garbage in, garbage out" principle). This means providing knowledge (information) and capabilities (tools) to the model only when necessary and beneficial.
· It pays attention to format: The way information is presented is crucial. A concise summary is far better than a list of raw data; a clear definition of a tool interface is also much more effective than a vague instruction.
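The "dynamic" point, that one request may need calendar data while another needs email content or web search results, can be illustrated with a tiny context builder. The source functions below are hypothetical stubs standing in for real integrations:

```python
# Sketch: context is built per request, not filled from a static template.
# The "sources" are hypothetical stubs standing in for real integrations.

def fetch_calendar():
    return "Calendar: busy 9:00-17:00 tomorrow."

def fetch_emails():
    return "Last email thread: informal tone, signed 'Cheers, Jim'."

def web_search(query):
    return f"Top result for '{query}': ..."

def build_context(request_text):
    """Select only the sources this particular request actually needs."""
    parts = []
    text = request_text.lower()
    if "meeting" in text or "schedule" in text:
        parts.append(fetch_calendar())   # scheduling needs availability
        parts.append(fetch_emails())     # ...and the right tone
    if "latest" in text or "news" in text:
        parts.append(web_search(request_text))  # freshness needs search
    return "\n".join(parts)

ctx = build_context("Are you free for a quick meeting tomorrow?")
```

A real system would route on intent classification rather than keyword matching, but the shape is the same: the context is the output of code that runs before the LLM call.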
Both a Science and an Art
In a longer post, Karpathy also argues that "context engineering" is a form of art.
People often associate prompts with the short task descriptions sent to the LLM in daily use.
However, in any industrial-grade LLM application, context engineering is both a profound science and a delicate art.
Its core lies in precisely filling the context window with the right information for the next operation.
It is a science because doing it well requires applying a whole range of techniques, including:
Task description and explanation, few-shot learning examples, RAG (Retrieval Augmented Generation), relevant (possibly multimodal) data, tools, state and history records, information compression, etc.
If there is too little information or the format is incorrect, the LLM won't have enough context to achieve optimal performance;
If there is too much information or it's not relevant enough, it will lead to increased costs and decreased performance of the LLM.
Doing this well is quite complex.
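One concrete instance of this balance is trimming dialogue history to fit a token budget. A rough sketch, using a naive word count as a stand-in for a real tokenizer:

```python
def trim_history(messages, budget=50):
    """Keep the most recent messages that fit a crude 'token' budget.
    Word count stands in for a real tokenizer in this sketch."""
    kept, used = [], 0
    for msg in reversed(messages):            # walk newest-first
        cost = len(msg["content"].split())
        if used + cost > budget:
            break                             # older messages get dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))               # restore chronological order

history = [
    {"role": "user", "content": "one two three four"},
    {"role": "assistant", "content": "five six"},
    {"role": "user", "content": "seven"},
]
trimmed = trim_history(history, budget=5)  # keeps only the two newest turns
```

Production systems typically summarize dropped turns rather than discard them outright, but the tradeoff is the same: too little context starves the model, too much wastes the window.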
It is an art because it requires developers to rely on an intuitive feel for the "temperament" of large models.
In addition to context engineering itself, an LLM application must also:
Properly break down the problem into a control flow
Precisely fill the context window
Dispatch the call request to an LLM with the appropriate type and capabilities
Handle the "generate-verify" UI/UX flow
And more - such as safety guards, system security, effect evaluation, parallel processing, data prefetching, etc...
Therefore, "context engineering" is just a small part of an emerging, complex, and substantial software layer.
This software layer is responsible for integrating and coordinating individual LLM calls and many other operations to build a complete LLM application.
Karpathy said that casually referring to such applications as "ChatGPT wrappers" is not only outdated but also completely wrong.
Some netizens joked that context engineering is the new "vibe coding."
Karpathy responded, "I'm not trying to coin a new term. I just think that when people mention 'prompts,' they tend to oversimplify what is actually a rather complex component."
You might use a prompt to ask the LLM "Why is the sky blue?" But an application must construct the context that lets the model solve the task at hand.
The Success or Failure of Agents Depends Entirely on It
In fact, the secret to building truly effective AI agents lies not in how complex the code is, but in how high-quality the context you provide is.
The fundamental difference between a rough demonstration product and a stunning agent lies in the quality of the context provided.
Imagine an AI assistant needs to arrange a meeting based on a simple email:
Hey, I was wondering if you're free for a quick meeting tomorrow?
The "rough demonstration" agent gets very poor context. It can only see the user's request and nothing else.
Its code may be fully functional, calling an LLM and getting a response, but the output is unhelpful and mechanical:
Thank you for your message. I'm available tomorrow. What time would you like to schedule the meeting?
Next, let's take a look at the stunning agent empowered by rich context.
The main task of its code is not to think about how to reply, but to collect the information the LLM needs to achieve its goal. Before calling the LLM, you'll expand the context to include:
Calendar information: showing that you're fully booked all day
Past emails with this person: to determine what kind of informal tone to use
Contact list: to identify that the other person is an important partner
Tools for send_invite or send_email
Then, you can generate a reply like this:
Hey, Jim! I'm completely booked tomorrow with back-to-back meetings. I'm available Thursday morning. Is that convenient for you? I've sent you an invitation. Let me know if this time works for you.
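The context-gathering step described above can be sketched end to end. Every data source, contact, and tool name below is invented for the example:

```python
# Sketch of the "stunning" agent's pre-call work: gather rich context,
# then hand the model everything it needs. All data here is hypothetical.

calendar = "Tomorrow: booked 9:00-18:00. Thursday: free 9:00-12:00."
email_history = "Previous threads with Jim are informal and first-name."
contacts = {"jim@example.com": {"name": "Jim", "priority": "key partner"}}
tools = ["send_invite", "send_email"]

incoming = "Hey, I was wondering if you're free for a quick meeting tomorrow?"

# The assembled context: availability, tone, relationship, and capabilities.
prompt = f"""You are my scheduling assistant.
Calendar: {calendar}
Tone guidance: {email_history}
Sender: {contacts['jim@example.com']['name']} \
({contacts['jim@example.com']['priority']})
Available tools: {', '.join(tools)}

Reply to this email and, if proposing a time, send an invite:
{incoming}"""
```

Nothing here is clever; the work is in deciding which sources matter for this task and surfacing them before the model is ever called.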
The secret to this stunning effect doesn't lie in a smarter model or a more sophisticated algorithm, but in providing the right context for the right task.
This is exactly why "context engineering" will become crucial.
So, the failure of an agent is not just the failure of the model, but the failure of the context.
To build powerful and reliable AI agents, we're gradually moving away from hunting for a "universal prompt" or relying on model updates.
Instead, the core lies in the engineering of context: providing the right information and tools, in the right format, at the right time.
This is a cross-functional challenge that requires deeply understanding the business use case, clearly defining the desired output, and carefully organizing all necessary information so that the LLM can truly "complete the task."
This view has won broad approval among netizens.
Finally, borrowing a line from a netizen, "memory" is the last piece of the AGI puzzle.
References:
https://www.philschmid.de/context-engineering
https://news.ycombinator.com/item?id=44427757
Editor: Taozi
This article is from the WeChat official account "New Intelligence Yuan", author: New Intelligence Yuan. Republished by 36Kr with permission.