In short
Anthropic has launched Claude Sonnet 5, a midsize model aimed at making agentic AI cheaper and more practical for everyday automation. The release intensifies competition with OpenAI and Google on performance, cost and safety.
- Claude Sonnet 5 is Anthropic’s new midsize model focused on agentic work and automation.
- The model is priced below Anthropic’s flagship systems and is meant to be a cheaper way to run agents.
- Anthropic says Sonnet 5 improves coding, knowledge work, safety and task completion versus Sonnet 4.6.
- The launch underscores how agentic capability has become the new baseline across major AI labs.
Anthropic is sharpening the competition among frontier AI labs with the release of Claude Sonnet 5, a new midrange model designed to handle longer, more autonomous tasks at a lower price point than the company’s top-tier systems. The launch arrives at a moment when “agentic” capability — the ability to plan, call tools, and complete multistep work with limited supervision — has quickly become a baseline expectation across the industry rather than a premium feature reserved for flagship models.
The company says Sonnet 5 can map out tasks, use browsers and terminals, and continue working with far less hand-holding than earlier generations. In practice, that positions the model as Anthropic’s answer to a market where OpenAI, Google and others are increasingly framing their latest releases as working agents instead of chatbots.
Anthropic is not just selling a smarter model. It is also making a strategic pricing play. Sonnet 5 is intended to be more affordable than the company’s larger models while still delivering much of the performance developers need for software coding, knowledge work and automation. That matters because the most valuable AI products are increasingly those that can do work reliably, not merely answer questions convincingly.
| Model | Positioning | Input price per 1M tokens | Output price per 1M tokens | Notable strengths |
|---|---|---|---|---|
| Claude Sonnet 5 | Midsize, agent-focused | $2 until Aug. 31, then $3 | $10 | Agentic work, coding, knowledge tasks, lower cost |
| Claude Opus 4.8 | Top-tier Anthropic model | Higher than Sonnet 5 | Higher than Sonnet 5 | Best accuracy on harder tasks, deeper judgment |
| OpenAI GPT-5.5 | Competing frontier model | More expensive than Sonnet 5 | More expensive than Sonnet 5 | High-end agentic capability |
| Google Gemini 3.1 Pro | Competing frontier model | More expensive than Sonnet 5 | More expensive than Sonnet 5 | General high-performance workloads |
| Google Gemini 3.5 Flash | Lower-cost agentic model | Cheaper than Sonnet 5 | Cheaper than Sonnet 5 | Budget-friendly autonomous workflows |
Agentic AI has become the new minimum standard
Anthropic’s announcement lands in a fast-changing segment of the AI market. Over the past year, model makers have increasingly shifted their messaging from “better chat” to “better workers.” A model is no longer judged only on how fluently it writes or how accurately it explains a concept. It is also judged on whether it can sequence actions, preserve context across a long task, and stay on track when interacting with external tools.
That is the core of the race Anthropic is now running. Its new Sonnet release is being presented as a model that can operate more independently than previous midsize systems, with performance that approaches the company’s most capable model on some tasks while costing much less to run.
OpenAI and Google have made similar claims in recent months, reinforcing the idea that the market has entered a new phase. Instead of asking whether an AI model can act as an agent, customers now want to know how well it can do so, at what cost, and with how much risk.
Why “agentic” matters for enterprise buyers
For businesses, the appeal of agentic systems is straightforward. They promise to reduce repetitive work by taking over workflows that require multiple steps, including research, software changes, CRM updates, communication tasks and internal coordination. The more these systems can complete without intervention, the more valuable they become.
That shift also changes the economics of model selection. A company may not need the most expensive model available if a cheaper one can complete the task nearly as well and with acceptable reliability. Anthropic is clearly leaning into that logic with Sonnet 5.
Anthropic says the new model can “make plans, use tools like browsers and terminals, and run autonomously” at a level that only larger and more expensive systems could reach recently.
What Claude Sonnet 5 is designed to do
Sonnet 5 is the latest version of Anthropic’s midsize Claude family, and the company says it significantly improves on Sonnet 4.6, which launched in February. The new model is meant to perform better in tasks that require reasoning, tool use, coding, and broader knowledge work.
Anthropic’s message is not that Sonnet 5 replaces its biggest model. Rather, the company is positioning it as a practical default for a wide swath of users who want strong performance without paying top-tier pricing.
Starting Tuesday, Claude Sonnet 5 becomes the default model for free and Pro users and will be available across all subscription tiers. That choice signals that Anthropic sees the model not as a niche test release, but as a workhorse product intended to be used broadly.
A model built for real tasks, not just demos
One of the clearest signs of where the industry is headed is that model vendors now highlight end-to-end task completion. Anthropic says Sonnet 5 can continue through multi-part assignments that earlier versions would have abandoned midway, and it can check its own output without being prompted to do so.
That matters because many real-world workflows are not single-turn questions. They involve a chain of actions: gathering information, making updates in one system, then drafting or sending communications in another. A model that can stay engaged across the entire sequence is more useful than one that performs well only at the first step.
Pricing puts pressure on the competition
Anthropic is launching Sonnet 5 at $2 per million input tokens and $10 per million output tokens through August 31. After that introductory period, the input price rises to $3 per million tokens, while the output price remains unchanged at $10.
That pricing makes the model cheaper than Opus 4.8, as well as OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro, according to Anthropic. It still costs more than Google’s Gemini 3.5 Flash, which appears to be aimed more squarely at budget-sensitive workloads.
For developers, token pricing is not an abstract detail. It determines whether a product is viable at scale. A model that is only slightly better but dramatically more expensive can be a poor fit for customer support workflows, document processing, internal automation or agentic software features that need to run thousands or millions of times.
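To make the scale point concrete, here is a minimal Python sketch of the arithmetic, using only the Sonnet 5 list prices quoted above. The workload (one million requests, each with 2,000 input tokens and 500 output tokens) is a hypothetical assumption chosen for illustration, not a figure from Anthropic, and the helper name `workload_cost` is ours.

```python
# Sonnet 5 list prices quoted in the article, in USD per 1M tokens.
INTRO_INPUT_PRICE = 2.0   # through August 31
LATER_INPUT_PRICE = 3.0   # after the introductory period
OUTPUT_PRICE = 10.0       # the same in both periods

TOKENS_PER_PRICE_UNIT = 1_000_000


def workload_cost(requests, input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of `requests` calls with the given per-call token counts."""
    input_cost = requests * input_tokens / TOKENS_PER_PRICE_UNIT * input_price
    output_cost = requests * output_tokens / TOKENS_PER_PRICE_UNIT * output_price
    return input_cost + output_cost


# Hypothetical workload (an assumption for illustration only).
REQUESTS = 1_000_000
INPUT_TOKENS_PER_REQUEST = 2_000
OUTPUT_TOKENS_PER_REQUEST = 500

intro_cost = workload_cost(
    REQUESTS, INPUT_TOKENS_PER_REQUEST, OUTPUT_TOKENS_PER_REQUEST,
    INTRO_INPUT_PRICE, OUTPUT_PRICE,
)
later_cost = workload_cost(
    REQUESTS, INPUT_TOKENS_PER_REQUEST, OUTPUT_TOKENS_PER_REQUEST,
    LATER_INPUT_PRICE, OUTPUT_PRICE,
)

print(f"Introductory pricing: ${intro_cost:,.0f}")   # $9,000
print(f"Standard pricing:     ${later_cost:,.0f}")   # $11,000
print(f"Increase: {(later_cost - intro_cost) / intro_cost:.1%}")  # 22.2%
```

Under these assumptions, the higher input price adds about 22% to the bill. The increase is smaller than the 50% rise in the input rate because output tokens, whose price does not change, account for more than half of the total.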
Why this pricing tier is important
- It gives developers a lower-cost alternative for autonomous tasks.
- It expands Anthropic’s reach beyond premium users and research-heavy customers.
- It pressures rival labs to defend both performance and cost efficiency.
- It reinforces the idea that agentic capability must now be affordable, not merely impressive.
How Sonnet 5 compares with Anthropic’s bigger model
Anthropic is careful not to claim that Sonnet 5 is better than Opus 4.8 across the board. Instead, it emphasizes a trade-off. Opus remains the company’s preferred model when users need higher accuracy on the hardest problems, particularly those that require subtle judgment or deeper research. Sonnet 5, by contrast, is framed as the stronger value proposition.
The company says Sonnet 5 comes close to Opus 4.8 on several benchmarks, while undercutting it on price. In some cases, it appears to outperform its larger sibling by a small margin on knowledge work, even if Opus still leads on certain advanced coding and evaluation tasks.
| Metric | Claude Sonnet 5 | Claude Opus 4.8 | Claude Sonnet 4.6 |
|---|---|---|---|
| Agentic coding benchmark | 63.2% | 69.2% | 58.1% |
| Knowledge work benchmark | Slightly above Opus 4.8 | Very strong | Below Sonnet 5 |
| Tool use / autonomy | Improved over Sonnet 4.6 | Top-tier | Earlier generation |
| Relative cost | Lowest of the three | Highest of the three | Higher than Sonnet 5 |
Those numbers illustrate a broader industry trend: frontier AI performance is becoming more layered. The absolute best model is no longer always the most practical one. A slightly less powerful system may be the smarter choice if it can deliver 90% of the capability at a fraction of the cost.
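As a rough check on the 90% figure, the only row in the table with hard numbers is the agentic coding benchmark. A minimal Python sketch of that arithmetic follows. The variable names are ours, and a ratio of benchmark scores is only a crude proxy for "capability," not a measure attributed to Anthropic.

```python
# Agentic coding benchmark scores from the table above, in percent.
scores = {
    "Claude Sonnet 5": 63.2,
    "Claude Opus 4.8": 69.2,
    "Claude Sonnet 4.6": 58.1,
}

opus = scores["Claude Opus 4.8"]

# Express each score as a share of the Opus 4.8 score and as a point gap.
relative = {name: score / opus for name, score in scores.items()}
gap_points = {name: opus - score for name, score in scores.items()}

for name in scores:
    print(f"{name}: {relative[name]:.1%} of Opus 4.8, "
          f"{gap_points[name]:.1f} points behind")
```

On this one benchmark, Sonnet 5 reaches about 91% of the Opus 4.8 score, compared with about 84% for Sonnet 4.6. The cost side of the comparison cannot be computed the same way, because no Opus 4.8 price is given here.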
Anthropic’s effort-level strategy
Anthropic has described the distinction between Sonnet 5 and Opus 4.8 as a matter of “effort level.” In other words, users can choose how much capability they want to rent for the job at hand. That framing helps Anthropic avoid a simple winner-takes-all comparison while giving developers a more flexible product ladder.
It also reflects a more mature market. Customers no longer want a single model for everything. They want a portfolio: a fast, cheap model for routine automation, and a more expensive model for difficult edge cases where precision matters most.
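The portfolio idea can be sketched as a simple routing rule. The Python below is a hypothetical illustration, not an Anthropic API: the `Task` type, the difficulty score, the tier labels and the 0.8 cutoff are all assumptions, and a real system would need its own way of estimating how hard a task is.

```python
from dataclasses import dataclass

# Hypothetical tier labels; these are not real model identifiers.
MIDSIZE_TIER = "midsize"     # cheaper model for routine automation
FLAGSHIP_TIER = "flagship"   # pricier model for difficult edge cases

# Hypothetical cutoff above which a task is escalated to the flagship tier.
ESCALATION_THRESHOLD = 0.8


@dataclass(frozen=True)
class Task:
    description: str
    difficulty: float  # assumed score in [0.0, 1.0] from some upstream estimate


def choose_tier(task: Task, threshold: float = ESCALATION_THRESHOLD) -> str:
    """Send routine tasks to the midsize tier and hard ones to the flagship tier."""
    if task.difficulty >= threshold:
        return FLAGSHIP_TIER
    return MIDSIZE_TIER


tasks = [
    Task("Update CRM account tiers", difficulty=0.3),
    Task("Audit a subtle concurrency bug", difficulty=0.9),
]
for task in tasks:
    print(f"{task.description}: {choose_tier(task)}")
```

The design choice is to default to the cheaper tier and escalate only when the estimate crosses the cutoff, which matches the article's point that most work does not need the most expensive model.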
Benchmarks suggest stronger autonomy and better task completion
Anthropic says Sonnet 5 shows clear gains over Sonnet 4.6 in reasoning, tool use, software development, and knowledge-intensive work. It also says testers found the model better able to complete extended tasks rather than halting when the work got messy or long.
That is a subtle but significant improvement. Many enterprise deployments do not fail because a model cannot answer a question. They fail because the model loses the thread, does not know when to continue, or stops before the final business step is done. A model that can sustain momentum can save time and reduce manual intervention.
Testers cited by Anthropic said Sonnet 5 was able to keep working through complex assignments and independently review its own output in situations where earlier versions might have stopped short.
Why benchmark gains matter — and why they don’t tell the whole story
Benchmarks remain important, but they only provide a partial picture. A model can score well in a controlled evaluation and still behave inconsistently in production, especially when it is interacting with live systems, uncertain data, or user-generated content.
That is why Anthropic’s focus on autonomous behavior is more meaningful than a single benchmark number. The company is signaling that real-world reliability is the product challenge, not just test performance.
For customers, the practical question is whether a model can do useful work repeatedly, under different conditions, without requiring extensive monitoring.
Safety remains central to Anthropic’s pitch
Anthropic is also emphasizing safety improvements in Sonnet 5, especially for agentic deployments where a model may have access to tools and workflows that could be abused. According to the company, the new model shows fewer problematic behaviors than Sonnet 4.6, including less willingness to cooperate with misuse and less deceptive behavior.
The model is also said to be better at rejecting harmful requests and resisting prompt-injection attempts, which are designed to manipulate a system into following malicious instructions hidden in content or external sources.
In addition, Anthropic says Sonnet 5 hallucinates less and engages less often in sycophantic behavior, meaning it is less likely to simply agree with users when it should not. Those are not minor quality-of-life improvements; they are relevant to trust, especially when a model is embedded in decision-support or automation systems.
Safety gains, but not the highest bar
Even with those improvements, Anthropic does not present Sonnet 5 as its safest model. The company says Opus 4.8 and Claude Mythos Preview still perform better on certain measures of misaligned behavior. It also notes that Sonnet 5 has a much lower ability to carry out dangerous cybersecurity tasks than the current Opus models.
That caveat is important. The same qualities that make agentic models valuable — planning, tool use, persistence — can also make them riskier if they are misused. A model that can operate software and follow instructions over time needs stronger safeguards than a simple text generator.
Anthropic’s safety framing suggests the company is trying to thread a difficult needle: make the model capable enough to be useful as an autonomous worker, but constrained enough that it does not become an operational liability.
Lovable co-founder Fabian Hedin said the model “refuses unsafe requests cleanly and consistently,” adding that safety is as important as build quality when powerful tools are put in the hands of large numbers of users.
What this means for developers and startups
For developers, Sonnet 5 may be most attractive as a middle option: more capable than budget models, but less expensive than flagship systems. That can be a sweet spot for startups building AI products that need strong quality without exploding their inference bills.
This is especially relevant in areas like workflow automation, coding assistants, CRM operations, customer support, internal productivity tools and business process orchestration. In these categories, the economics of calling a model can determine whether the product works at all.
A real-world example from Zapier
Anthropic highlighted feedback from Zapier, a company that builds automation software. A senior engineer there said the new model successfully handled a two-step workflow involving Salesforce account tiers and a launch announcement for enterprise contacts, a job on which previous versions tended to break down before completion.
That kind of testimonial matters because it reflects the type of work customers are actually trying to automate. The most useful agentic systems are not the ones that look impressive in a demo. They are the ones that quietly finish routine work end to end.
For automation companies, even a modest improvement in task completion can have an outsized effect on customer adoption. If users trust the model to finish a workflow, they are more likely to let it handle a larger share of the job.
The broader business race behind the launch
Anthropic’s Sonnet 5 release should be understood as part of a larger competitive cycle. The major AI labs are now trying to define the standard architecture of modern work: models that can think, plan, act and verify. The fight is no longer only about raw intelligence. It is about practical deployment.
That competition has several dimensions:
- Capability: Can the model complete harder tasks with fewer errors?
- Autonomy: Can it continue working without a human stepping in?
- Cost: Can the provider make the economics work at scale?
- Safety: Can the model be used in real systems without unacceptable risk?
Anthropic’s pitch suggests it believes Sonnet 5 checks enough of those boxes to become a default choice for a large population of users. That is a major commercial opportunity, especially if the model becomes the standard layer powering other software products.
Why the middle tier is strategically important
In consumer and enterprise AI markets alike, the middle tier is often where scale happens. The largest models may win prestige, but the midsize systems often win volume. They are cheaper to deploy, still powerful enough for many tasks, and easier to productize.
If Sonnet 5 succeeds, it could strengthen Anthropic’s position not just with developers, but with SaaS companies looking for dependable agentic infrastructure. That would make the model a revenue engine rather than just a showcase release.
What to watch next
Sonnet 5’s launch will likely be judged in three ways over the coming months: adoption, real-world reliability and competitive response. The launch pricing is temporary, and the price increase after August 31 may test how much demand exists at the new level.
Developers will also be watching whether the model lives up to Anthropic’s claims outside curated benchmarks. If it truly finishes more tasks, refuses unsafe instructions more consistently and remains affordable, it could become a preferred default for agentic applications.
Competitors, meanwhile, are unlikely to stand still. OpenAI and Google have been moving quickly to market their own agent-oriented models, and further releases are almost certain as each company tries to lock in developers before the category becomes commoditized.
Key questions for the market
- Will Sonnet 5 prove reliable enough for production automation?
- Can Anthropic hold its pricing advantage after the introductory period?
- Will enterprises choose midrange models over flagship systems for most agent tasks?
- How quickly will rivals respond with cheaper, more capable alternatives?
A clear sign of where AI is heading
Claude Sonnet 5 is more than an incremental model upgrade. It is a marker of a market that is rapidly reorganizing around autonomous software behavior. The labs are no longer competing simply to make models more conversational. They are competing to make them more useful in the messy, repetitive and often fragile workflows that power modern businesses.
Anthropic’s bet is that the winning model will not always be the most expensive or the most intelligent on paper. It will be the one that can act independently, stay safe enough for real deployment and do so at a price customers can justify.
If that view proves correct, Sonnet 5 may be remembered less for any single benchmark than for helping cement the idea that agentic AI has become the default expectation — and that the next big battle is over who can deliver it most efficiently.