The head of Codex pushes back on the Cursor CEO's theory of "specification-driven development." The hit product Sora was built in just 18 days, relying on agents running around the clock. Here are the inside secrets of OpenAI's rapid development.
Since the release of GPT-5 in August, Codex has shown explosive momentum. Usage has grown 20-fold, and it processes trillions of tokens every week, making it OpenAI's most popular coding agent.
"Codex's 20-fold growth is not just because the model got more powerful, but because we understand that a real agent is not just a model; it is the joint product of the model, the API, and the framework." In a recent podcast, Alexander Embiricos, product lead of OpenAI's coding agent Codex, revealed the thinking behind it.
For example, Codex has made breakthroughs in long-horizon tasks. To let it work continuously for a dozen hours or even days, the team designed a mechanism called "compression": the model extracts the key information, the API carries the task chain, and the framework keeps everything running stably. The three layers mesh like gears, enabling Codex to complete long-running programming tasks that traditional large models struggle to sustain.
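The compaction idea described above can be sketched as a simple loop: whenever the working transcript nears the context limit, the model distills it into a summary and work continues in a fresh session. Everything here is illustrative, not Codex's actual API; `summarize`, `CONTEXT_LIMIT`, and `run_long_task` are invented names, and the summarizer is a stub standing in for a model call.

```python
CONTEXT_LIMIT = 20  # max transcript entries the "model" can hold per session


def summarize(messages):
    """Stand-in for the model distilling a long transcript into key facts."""
    return f"[summary of {len(messages)} steps: goals, decisions, open TODOs]"


def run_long_task(steps):
    """Run many task steps, compacting the transcript whenever it hits the limit."""
    transcript = []
    sessions = 1
    for step in steps:
        if len(transcript) >= CONTEXT_LIMIT:
            # Model layer: extract what matters; framework layer: restart cleanly.
            transcript = [summarize(transcript)]
            sessions += 1
        transcript.append(f"did: {step}")
    return transcript, sessions


history, sessions = run_long_task([f"step {i}" for i in range(50)])
print(sessions)    # the 50-step task spanned 3 compacted sessions
print(history[0])  # first entry is the latest summary, not the raw history
```

The key design point is that the summary, not the raw transcript, is what survives across sessions, which is what lets total working time exceed any single context window.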
It is this underlying logic that allows Codex to perform amazingly in real business scenarios.
Andrej Karpathy once publicly shared that he was stuck on a bug for several hours and finally handed it over to Codex. It completed the fix within an hour.
The Sora team relied on Codex to launch an Android app from scratch in just 28 days, and it went straight to the top of the App Store.
Looking back, Alexander Embiricos also admitted that the path of Codex was not clear from the beginning.
The early Codex was "too futuristic": it used a remote, asynchronous interaction model that suited senior engineers' habits but was unfriendly to most others. The real turning point came from one key adjustment: the team brought Codex from the cloud back to the local environment, letting it work directly in the engineer's IDE, which made it far more practical.
In Alexander Embiricos' view, today's Codex is like "a smart but passive intern who writes code very fast." Codex has been monitoring itself, helping train itself, and constantly evolving. Embiricos hopes that in the future Codex can take part in the entire software development process and become a good teammate for engineers.
Alexander Embiricos also spoke about OpenAI's organizational culture. He was struck by OpenAI's speed and ambition; the iteration pace was unlike anything he had seen. Compared with organizations that "aim before shooting," he believes OpenAI's distinctiveness lies in "firing first, then aiming": release first, then refine the path based on real usage feedback. A concentration of the world's best talent and a bottom-up culture make this high-speed iteration routine.
On when AGI will arrive, Alexander Embiricos offered an interesting perspective. He believes the real limiting factor for AGI right now is not the model's capability but humans: our limited typing speed and review speed are holding it back.
He made a prediction: the first users whose productivity curves jump sharply will appear next year, and the changes will then spread at an accelerating pace. "When the growth curve suddenly becomes extremely steep," he said, "we may already be at the door of AGI."
The podcast covered many more details behind Codex, along with Alex's other insights. We have translated and lightly edited the conversation for length and clarity, without changing its meaning.
Highlights of the podcast:
- With Codex's help, two or three OpenAI engineers built the Sora Android app in just a few weeks and took it to No. 1 in the App Store. The app went from scratch to employee testing in only 18 days and was officially released 10 days after that. Codex helped enormously by analyzing the existing iOS app, drawing up work plans, and implementing features while comparing the two platforms side by side.
- Even if AI models stop improving tomorrow, we still need to spend several years on product development to fully realize their potential. The development speed of this technology exceeds our current ability to make the best use of it.
- The key to making full use of Codex is to choose the most difficult problems, not the simplest ones. These tools are designed to solve tricky bugs and complex tasks, not simple ones. Start with the problems that usually take you hours to solve.
- OpenAI's original Codex product was "too far ahead of its time." It ran asynchronously in the cloud, which was great for advanced users but hard for beginners. When the team brought Codex back to where engineers work every day, the code editor on their own machines, growth exploded. In the past six months, Codex usage has grown 20-fold.
- Writing code may become a universal way for AI to accomplish any task. Instead of clicking through interfaces or building one-off integrations, AI may perform best by writing small programs on the fly. That means every AI assistant should have coding built in, not just specialized programming tools.
- OpenAI's designers now write and publish their own code. The design team maintains a fully functional prototype built with the assistance of AI. When they have an idea, they directly write code, test it, and often submit it to the production environment by themselves. Engineers only intervene when the codebase is particularly complex.
- The biggest bottleneck in AI productivity is not AI itself, but human typing speed. The limiting factor lies in the speed at which you input prompts and the speed at which you review the work generated by AI. Before AI can more reliably verify its own output and actively provide help, we will not be able to see the full productivity improvement that these tools can bring.
- The joy of writing code is gradually being replaced by reviewing the code generated by AI. Engineers used to love the creative process of building code, but now they spend more time reading the code generated by AI. The next challenge is how to make the code review process faster and more satisfying.
- New AI models can now work continuously for 24 to more than 60 hours on a single task. A technique called "compression" lets the AI summarize what it has learned before running out of context, then continue working in a fresh session. This enables overnight or multi-day autonomous work that was previously impossible.
- If you are starting a company now, a deep understanding of specific customers matters more than being good at building product. Building product is getting easier; the real advantage today is knowing what to build and for whom.
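The "small programs on the fly" pattern from the highlights above can be sketched as follows. This is a minimal illustration, not anything from Codex itself: `generate_program` is a hand-written stand-in for a model call, and the task string is invented for the example.

```python
import subprocess
import sys


def generate_program(task: str) -> str:
    """Stand-in for a model turning a natural-language task into a script."""
    if task == "sum the numbers 1 to 100":
        return "print(sum(range(1, 101)))"
    raise NotImplementedError(task)


def run_generated(task: str) -> str:
    """Execute the generated script in a subprocess and capture its output."""
    code = generate_program(task)
    result = subprocess.run(
        [sys.executable, "-c", code],  # run the throwaway program in isolation
        capture_output=True, text=True, timeout=10, check=True,
    )
    return result.stdout.strip()


print(run_generated("sum the numbers 1 to 100"))  # → 5050
```

The appeal of this pattern is that one general capability, writing and running code, substitutes for an unbounded number of purpose-built integrations.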
OpenAI's Speed, Culture, and Talent Management
Lenny: I'd like to start with your experience at OpenAI. You joined about a year ago. Before that, you ran your own company for about five years, and before that you were a product manager at Dropbox. OpenAI is obviously very different from every company you've worked at. What is most distinctive about how OpenAI operates? What have you learned there that you'll take with you wherever you go next (assuming you leave one day)?
Alex: So far, the pace and ambition of working at OpenAI have far exceeded my imagination. When I say that, I think back to the startup world, where everyone believed their company was fast-paced, had a high talent bar, and had grand goals. After coming to OpenAI, I realized those words mean something of a completely different scale here. At OpenAI, I have redefined what "speed" and "ambition" really mean.
We often hear outsiders marvel at how fast AI companies move. The first example that comes to my mind is the explosive growth of the models themselves. Even beyond the scale-up in publicly visible data, Codex's tenfold growth happened in just a few months, and the progress has kept accelerating. For me at least, having lived through those stages, when I build technology products I now naturally set that speed and scale as the bar; anything less feels insufficient. By comparison, the pace I experienced at startups seems much slower.
In a startup, you constantly balance investment against the chance of failure: try first, then pivot. At OpenAI, though, I came to appreciate the enormous influence the work carries, and doing it well demands a very high level of energy. That forces me to be far more decisive about how I spend my time.
Lenny: Before we continue, a follow-up question: is there a special organizational structure, some structural reason, that lets a team like Codex advance so rapidly? Or is it that I just don't understand the open-source development model well enough, and that's why the team can move so fast? There must be some structure that makes all this possible.
Alex: On one hand, the technology we use has completely changed many things, including how we build products and what we can do for users. We often talk about improvements in the base models, but even if model progress stopped at this level (which is not the case), we would still be far behind on product development; there is an enormous amount of product yet to be built. In that sense, the field is far less mature than the outside world imagines.
There have also been plenty of surprises. When I first arrived at OpenAI, I didn't know much about the organizational structure. In the past, as a product manager at a startup or at Dropbox, inspiring the team and making sure it was headed in the right direction were standard, critically important jobs. But at OpenAI, since we don't know exactly which capabilities will emerge soon or which will ultimately work, even when something is technically feasible we can't be sure of the outcome. So we have to stay humble and learn through continuous experimentation.
The organizational structure here is designed to run in a highly bottom-up way, and everyone is eager to move fast. Many companies claim to be bottom-up; OpenAI truly is. That has been a valuable lesson for me, and it has made me realize it may be hard for me to work at a non-AI company in the future. I'm not even sure what that would mean. If I could go back, I would do things completely differently.
Lenny: From what you've described, it sounds more like “ready, fire, aim” rather than “ready, aim, fire.” The thinking of many AI companies seems to be that since we don't know how users will ultimately use the product, it's meaningless to spend a lot of time making the product perfect. The best way is to release it as soon as possible, observe how people use it, and then iterate quickly.
Alex: This analogy makes some sense, but the target itself is vague. We roughly predict what might happen in the future, but there is still a lot of uncertainty. A research director often says that at OpenAI, we can have high - quality conversations about the future one year from now, but the closer we get to that time, the more difficult it is to make rational plans. We will conceive what we want to achieve in the long - term future, especially on issues such as AI alignment, where we must consider the very long - term future. But when we really enter the product stage, we will start to focus on tactical details, such as what specific products to build and how people will actually use them. In terms of products, we rely more on empirical research for verification.
Lenny: When people hear what you do, they may think a company like yours can boldly try many things without strict plans for the next few months. But the key is that you hire the best talent in the world, which seems to be the decisive factor in such a company's success.
Alex: That really resonates with me. When I first joined, I was struck by everyone's personal drive and autonomy. I don't think OpenAI's way of operating can be replicated at another company just by reading an article or listening to a podcast. It may sound blunt, but few companies have people who can operate this way. For this model to appear elsewhere, it would probably need many adjustments.
Codex's Positioning, Core Philosophy, and Product Vision
Lenny: Let's talk about Codex. You're the lead of Codex. How is Codex progressing now? Can you share some data? Also, not everyone is clear about what Codex is. Can you explain it?
Alex: Codex is an open-source coding agent. Concretely, it ships as an IDE extension for VS Code and as a terminal tool. Once installed, you can interact with Codex: it answers code-related questions, writes code, runs tests, and executes code, covering the most labor-intensive part of the software development lifecycle, which is actually writing the code that gets deployed to production.
More broadly, we think of Codex as the starting point of a software engineering teammate. When we use the word "teammate," we envision it not only writing code but participating in the entire software process, from conception and planning through downstream verification, deployment, and maintenance.
I like to imagine today's Codex as a very smart intern who doesn't check Slack and doesn't proactively check monitoring systems like Sentry unless you ask. So even though it's very smart, you won't let it write code entirely unsupervised. Most people still use it in pair-programming mode.
But we hope to reach a point in the future where, just like when you hire a new intern, you don't just let them write code but also let them participate in the entire process. Even if their first attempt isn't completely correct, they will continue to participate and eventually achieve the goal through iteration.
Lenny: I understand what you mean by “not checking Slack,” which means it won't be distracted and is always fully focused on work. But you mean it can't grasp the complete context of what's going on, right? For the best members of the team, you won't tell them what to do at every step. You just let them understand your communication style in the first few meetings, and then they can work independently in the team and even actively collaborate with other parts of the codebase.
Alex: Yes, we think a truly excellent teammate should be proactive, and one of the main goals of Codex is to make the agent have this kind of proactiveness. I think this is an important part of achieving OpenAI's mission — making AI truly benefit all of humanity.
Current AI products are actually quite hard to use, because the user must think very deliberately about when to ask the model for help. If you don't actively make a request, the model can't help you. Today an ordinary user may issue only dozens of instructions to AI per day, but if a system is truly intelligent, a person could be getting thousands of times more help from it every day.
So much of our work on Codex is about figuring out how to build a "teammate agent" that provides help by default.
Lenny: When people think of Cursor or Claude Code, they think of tools in the integrated development environment that help write code and offer auto-completion. From your description, your vision is clearly different: a system that truly builds code for you like a remote teammate. You can talk to it and have it perform operations, and it also has IDE features like auto-completion. How is your way of thinking about Codex different?
Alex: Our goal is for developers to feel like they have superpowers, completing work faster without having to stop everywhere and think, "how should I invoke the AI for this?" It should collaborate with you as part of the system: when you take an action, it starts working automatically, without you having to manage it.
Codex's Technological Breakthroughs, Growth Drivers, and Three-Layer System Architecture
Lenny: I have many similar questions. But I'd like to ask first: How is Codex progressing? Are there any statistics or figures that you can share?
Alex: Codex has seen explosive growth since its release. GPT-5 came out in August, and by then we had already observed some very interesting product insights. If you're interested, I can walk through how that growth happened. The last public figure was that Codex usage had grown more than tenfold since August; it has now reached twentyfold. The Codex model currently serves trillions of tokens every week and is our most popular coding model.
One important thing we did was to form a closely integrated team that allowed the product and research teams to jointly iterate on the model and framework. This way, we can conduct more experiments more quickly and understand how the model and tools collaborate. When we train the model, we use our own framework and have very clear views on the framework. Recently, we have started to see other large API coding customers adopting similar methods, and these models have become more general.
Now, Codex has become the most widely used coding model, and there is also a corresponding version in the API.
Lenny: You mentioned that "growth was unlocked," and I'm very curious about that. Before you joined, it seemed to me that Claude Code dominated everything. Everyone was using it; it was the best way to generate code at the time. But then Codex appeared. I remember Karpathy posting a tweet saying he had never seen such a model. He