
To get ahead of OpenAI, Anthropic released Claude Opus 4.1 dozens of minutes earlier than expected.

机器之心 2025-08-06 11:36
Netizen: Can it solve the "code mess" problem (i.e., unmaintainable legacy code)?

Will you pay?

What a coincidence! Just half an hour before Sam Altman officially announced two open-source reasoning models, Anthropic jumped ahead and released a new model, Claude Opus 4.1.

Previously, it was always OpenAI stealing the limelight from others. This time, it's OpenAI's turn to be "preempted"! History is always full of drama!

Comparing the timestamps of the two posts, they are only minutes apart. Did Anthropic get wind of the announcement in advance, or was it pure coincidence? Or perhaps Anthropic assumed OpenAI was about to release GPT-5 and pushed its own model out early. The timing is so tight that it hardly seems accidental. What do you think?

In short, whenever these overseas AI companies launch new models, the same pattern plays out: each races to upstage the other.

The Debut of Claude Opus 4.1

The latest Claude Opus 4.1 builds on Claude Opus 4, released at the end of May this year, so Anthropic is clearly iterating quickly. Claude Opus 4.1 brings significant improvements in agentic tasks, real-world coding, and reasoning, and offers a 200K context window.

For both individual and business users, Claude Opus 4.1 is now available on the Claude Pro, Max, Team, and Enterprise plans.

For developers, Claude Opus 4.1 can be used through the following platforms:

  • Anthropic API
  • Amazon Bedrock
  • Google Cloud's Vertex AI

In addition, Claude Opus 4.1 has also been integrated into Claude Code.

In terms of API pricing, Claude Opus 4.1 is priced as follows:

  • $15 per million input tokens
  • $75 per million output tokens

Enabling prompt caching can cut costs by up to 90%, and batch processing can cut them by up to 50%.
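As a rough illustration of the arithmetic, here is a back-of-the-envelope cost estimator; the token counts are invented for the example, and it assumes cached input reads cost about one tenth of the list input price (the source of the "up to 90%" figure) while batching halves the bill uniformly.

```python
# Back-of-the-envelope cost estimate for Claude Opus 4.1 API usage.
# Prices are the list prices quoted above; token counts and discount
# assumptions are illustrative only.

INPUT_PRICE_PER_MTOK = 15.00   # USD per million input tokens
OUTPUT_PRICE_PER_MTOK = 75.00  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_fraction: float = 0.0,
                  use_batch: bool = False) -> float:
    """Estimate USD cost, assuming cached input reads cost ~10% of the
    normal input price and the batch discount halves the total."""
    cached = input_tokens * cached_input_fraction
    uncached = input_tokens - cached
    cost = (uncached * INPUT_PRICE_PER_MTOK
            + cached * INPUT_PRICE_PER_MTOK * 0.10
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return cost * 0.5 if use_batch else cost

# Hypothetical workload: 2M input tokens (80% cache hits), 300K output tokens.
print(f"${estimate_cost(2_000_000, 300_000, cached_input_fraction=0.8):.2f}")
```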

Pricing: https://www.anthropic.com/pricing#api

In terms of performance, Opus 4.1 reaches 74.5% on the SWE-bench Verified benchmark. The model has also improved at in-depth research and data analysis, especially detail tracking and agentic search.

According to GitHub's evaluation, Claude Opus 4.1 improves on Opus 4 across most capabilities, with especially notable gains in multi-file code refactoring.

For enterprise users, Rakuten Group found that when dealing with large codebases, Opus 4.1 can accurately locate the parts that need to be modified without making unnecessary changes or introducing new bugs. This accuracy makes their teams more willing to use it in daily debugging tasks.

Other enterprise users reported that on their internal junior-developer evaluation benchmark, Opus 4.1 improves markedly over Opus 4, a performance leap roughly comparable to the upgrade from Sonnet 3.7 to Sonnet 4.

Anthropic recommends that all users upgrade from Opus 4 to Opus 4.1. In the API, developers can access the new model simply by specifying claude-opus-4-1-20250805 as the model name.
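For example, switching over via the official Anthropic Python SDK should only require passing the new model ID. This is a minimal sketch with a placeholder prompt, assuming the SDK is installed and an API key is set in the environment.

```python
# Minimal sketch: calling Claude Opus 4.1 through the Anthropic Python SDK.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY in the environment.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-1-20250805",  # the model ID mentioned above
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Refactor this function to remove duplication: ..."}
    ],
)
print(message.content[0].text)
```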

Use Cases

Claude Opus 4.1 offers a hybrid reasoning mode: it can respond near-instantly or show its extended step-by-step thinking. API users can also fine-tune the thinking budget to strike the best balance between cost and performance.
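As a concrete sketch of what controlling that budget looks like, the request below assumes the Messages API's extended-thinking (`thinking`) parameter; the 8,000-token budget and the prompt are arbitrary example values.

```python
# Sketch: requesting extended thinking with an explicit token budget.
# The 8,000-token budget is an arbitrary example; max_tokens must be
# larger than the thinking budget.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-1-20250805",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Plan a multi-file refactor of a payments module."}],
)

# The response interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
```

A larger budget generally buys deeper reasoning at higher cost and latency, so it is worth tuning per task.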

Its typical use cases include but are not limited to:

Advanced programming capabilities: Claude Opus 4.1 leads the SWE-bench benchmark. It can sustain engineering tasks that span days, delivering coherent, context-aware solutions across thousands of steps. Thanks to better code taste and support for 32K output tokens, it adapts flexibly to specific programming styles and delivers excellent quality on large-scale code generation and refactoring projects.

Agentic search and research: Claude Opus 4.1 can efficiently retrieve from external and internal data sources and synthesize comprehensive insights in complex information environments. With its strong performance on agentic search tasks, it can conduct autonomous research for hours, simultaneously analyzing materials such as patent databases, academic papers, and market reports to provide strategic insights for decision-making.

Finally, with the release of the new model, Anthropic also released a system card. Readers who are interested can go and read it.

Address: https://assets.anthropic.com/m/4c024b86c698d3d4/original/Claude-4-1-System-Card.pdf

Everyone is also looking forward to the new model solving the "code mess" problem.

However, faced with the high subscription fee, people are complaining: "It's too expensive. I can't afford it."

"It consumes too many tokens."

By the way, on the first day of Google's AI chess competition, Claude Opus 4 lost to Gemini 2.5 Pro. I wonder if the outcome would be different if Claude Opus 4.1 participated.

This article is from the WeChat official account "Almost Human" (ID: almosthuman2014), author: Someone interested in AI. It is published by 36Kr with authorization.