
Two major developments occurred overnight in Silicon Valley. GPT-5.3-Codex is taking on Claude 4.6, and Altman is really getting anxious.

新智元 2026-02-06 16:17
ChatGPT created itself.

Within a single day, two major programming AIs have bombarded Silicon Valley one after another. After Claude Opus 4.6, Altman urgently released GPT-5.3-Codex. The battle between these two titans has completely kicked off the fight for the AI throne.

Silicon Valley won't sleep tonight!

Claude Opus 4.6 launched a surprise attack late at night without any warning. Unexpectedly, it caught Altman off guard.

In response, OpenAI rushed into battle. Within just half an hour, it unveiled its most powerful agentic coding model - GPT-5.3-Codex.

There's no GPT-5.3, only GPT-5.3-Codex!

It combines the top-notch coding capabilities of GPT-5.2-Codex with the reasoning and professional knowledge capabilities of GPT-5.2, and it also runs 25% faster.

It can easily handle long-running tasks that involve in-depth research, tool use, and complex execution.

GPT-5.3-Codex is like a colleague fighting side by side with you. You can guide and interact with it in real-time while it's working, without having to worry about losing the context at all.

It's worth mentioning that GPT-5.3-Codex is also the first model to play a key role in its own creation process.

With the advent of GPT-5.3-Codex, the role of Codex has undergone a qualitative leap:

It has evolved from an AI agent that can only write and review code into one that can do almost anything developers and professionals can do on a computer.

GPT-5.3-Codex is now available on paid ChatGPT plans, everywhere Codex can be used: the app, the CLI, the IDE extensions, and the web version.

Now, all of Silicon Valley has become the "battlefield" for the final showdown between Anthropic and OpenAI. The air is filled with the smell of gunpowder.

Interestingly, Altman originally announced the release of the new model at 12 a.m., but Anthropic seized the opportunity to release its product first.

Overnight, the two most powerful programming AIs went head-to-head. Netizens complained one after another, "I simply can't keep up with the iteration speed of AI."

GPT-5.3-Codex makes its debut with stronger coding capabilities

How powerful is GPT-5.3-Codex? You'll know once you see its achievements.

New SOTA in software engineering

GPT-5.3-Codex set a new industry high in the SWE-Bench Pro evaluation, which assesses real-world software engineering.

Meanwhile, on Terminal-Bench 2.0, which measures the terminal skills of coding agents, its performance far exceeded the previous SOTA.

It's worth mentioning that GPT-5.3-Codex consumes far fewer tokens to achieve all this than any previous model.

Unlike SWE-bench Verified, which only tests Python, SWE-Bench Pro covers four languages. It is not only more resistant to data contamination but also more challenging, diverse, and relevant to the industry.

Create games from scratch

Combining cutting-edge coding capabilities, aesthetic improvements, and context compaction, GPT-5.3-Codex can produce amazing results. It can even build highly complex games and applications from scratch over the course of a few days.

To test the model's web development and long-running agent capabilities, OpenAI asked GPT-5.3-Codex to create two games: a second version of the racing game that was released with the Codex App, and a diving game.

Using its web game development skills and pre-selected, generic follow-up prompts (such as "fix bugs" or "improve the game"), GPT-5.3-Codex iterated on the games autonomously over millions of tokens.
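The iteration setup described above can be pictured as a simple loop. The following is only a minimal sketch and not OpenAI's actual harness: `agent_step`, `fake_agent`, the token counts, and the budget are all hypothetical names and numbers made up for illustration. Only the two follow-up prompts come from the article.

```python
from typing import Callable, List

# Generic follow-up prompts quoted in the article.
FOLLOW_UPS = ["fix bugs", "improve the game"]


def iterate(agent_step: Callable[[str], int], first_prompt: str, token_budget: int) -> List[str]:
    """Send the first prompt, then cycle through FOLLOW_UPS until the budget is spent.

    agent_step is a hypothetical callable: it takes a prompt, performs one round
    of work, and returns the number of tokens that round consumed.
    """
    sent: List[str] = []
    used = 0
    prompt = first_prompt
    turn = 0
    while used < token_budget:
        used += agent_step(prompt)  # one round of autonomous work
        sent.append(prompt)
        prompt = FOLLOW_UPS[turn % len(FOLLOW_UPS)]  # pick the next generic nudge
        turn += 1
    return sent


def fake_agent(prompt: str) -> int:
    """Stand-in for a real coding agent; pretends every round costs 400,000 tokens."""
    return 400_000


history = iterate(fake_agent, "build a racing game", token_budget=1_000_000)
print(history)
```

The point of the sketch is that the follow-up prompts carry no game-specific guidance, so any improvement between rounds has to come from the agent itself.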

Racing game: It includes different racers, eight maps, and even items that can be triggered by the space bar.

Diving game: Players can explore various coral reefs, collect fish to complete their fish encyclopedia, and manage their oxygen at the same time.

· Better understand your intentions

Compared with GPT-5.2-Codex, GPT-5.3-Codex understands your intentions more accurately when you ask it to build an everyday website.

For simple or vaguely described prompts, it now defaults to generating websites with richer functionality and more sensible defaults, giving you a better starting canvas for bringing your ideas to life.

· GPT-5.3-Codex vs GPT-5.2-Codex

For example, when both GPT-5.3-Codex and GPT-5.2-Codex are asked to build a landing page, GPT-5.3-Codex automatically displays the annual plan as a discounted per-month price, which makes the discount look clear and deliberate, rather than simply showing the total annual amount.
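The arithmetic behind that pricing display is simple. Here is a minimal sketch, assuming made-up prices of $29.00 per month and $288.00 per year; the function names and the label wording are hypothetical and are not taken from the generated page.

```python
from decimal import Decimal, ROUND_HALF_UP


def monthly_equivalent(annual_total: Decimal) -> Decimal:
    """Annual plan price expressed per month, rounded to cents."""
    return (annual_total / 12).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def discount_percent(monthly_price: Decimal, annual_total: Decimal) -> int:
    """Whole-percent saving of the annual plan versus 12 monthly payments."""
    full_year = monthly_price * 12
    saving = (full_year - annual_total) / full_year * 100
    return int(saving.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Hypothetical prices, for illustration only.
monthly_price = Decimal("29.00")
annual_total = Decimal("288.00")

per_month = monthly_equivalent(annual_total)
saving = discount_percent(monthly_price, annual_total)
label = f"${per_month}/mo billed annually (save {saving}%)"
print(label)  # $24.00/mo billed annually (save 17%)
```

Showing $24.00 next to the $29.00 monthly price lets a visitor compare the two plans at a glance, whereas a bare $288.00 annual total would leave the division to the reader.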

In addition, it created an auto-rotating testimonial carousel with three different user quotes instead of a single one. This makes the page look more complete by default, more like a product that is ready to launch.

[Screenshots: landing pages generated by GPT-5.3-Codex and GPT-5.2-Codex]

Prompt:

Build a landing page for Quiet KPI, a founder-friendly weekly metrics summary. Adopt a soft SaaS aesthetic style with glassy texture cards, a gradient from lavender to blue, and a subtle blur effect. Sections include: a hero screen with email collection, a grid of example report cards, a list row of integrations, a carousel of customer testimonials, a switch between monthly/annual payment prices, a FAQ section, and a footer.

· Use Satoshi or a similar geometric sans-serif font.

· Use rounded buttons with a 14px radius and a strong focus state.

· Add a tasteful scroll-based reveal effect.

General capabilities beyond programming

The work done by software engineers, designers, product managers, and data scientists goes far beyond generating code.

GPT-5.3-Codex supports every part of the software lifecycle, including debugging, deployment, monitoring, writing PRDs, editing copy, user research, testing, and metrics.

It can also help users build whatever they want, whether that means creating beautiful slideshows or performing complex data analysis in spreadsheets.

On GDPval, which measures professional knowledge work, GPT-5.3-Codex performed excellently, matching the top-level results of GPT-5.2.

1. Financial advice slideshow

2. Retail training document