Sunday, March 29, 2026
Mobile Offer

🎁 You've Got 1 Reward Left

Check if your device is eligible for instant bonuses.

Unlock Now
Survey Cash

🧠 Discover the Simple Money Trick

This quick task could pay you today — no joke.

See It Now
Top Deals

📦 Top Freebies Available Near You

Get hot mobile rewards now. Limited time offers.

Get Started
Game Offer

🎮 Unlock Premium Game Packs

Boost your favorite game with hidden bonuses.

Claim Now
Money Offers

💸 Earn Instantly With This Task

No fees, no waiting — your earnings could be 1 click away.

Start Earning
Crypto Airdrop

🚀 Claim Free Crypto in Seconds

Register & grab real tokens now. Zero investment needed.

Get Tokens
Food Offers

🍔 Get Free Food Coupons

Claim your free fast food deals instantly.

Grab Coupons
VIP Offers

🎉 Join Our VIP Club

Access secret deals and daily giveaways.

Join Now
Mystery Offer

🎁 Mystery Gift Waiting for You

Click to reveal your surprise prize now!

Reveal Gift
App Bonus

📱 Download & Get Bonus

New apps giving out free rewards daily.

Download Now
Exclusive Deals

💎 Exclusive Offers Just for You

Unlock hidden discounts and perks.

Unlock Deals
Movie Offer

🎬 Watch Paid Movies Free

Stream your favorite flicks with no cost.

Watch Now
Prize Offer

🏆 Enter to Win Big Prizes

Join contests and win amazing rewards.

Enter Now
Life Hack

💡 Simple Life Hack to Save Cash

Try this now and watch your savings grow.

Learn More
Top Apps

📲 Top Apps Giving Gifts

Download & get rewards instantly.

Get Gifts
Summer Drinks

🍹 Summer Cocktails Recipes

Make refreshing drinks at home easily.

Get Recipes

Latest Posts

Anthropic Launches Claude Sonnet 4.5 with New Coding and Agentic State-of-the-Art Results


Anthropic released Claude Sonnet 4.5 and sets a new benchmark for end-to-end software engineering and real-world computer use. The update also ships concrete product surface changes (Claude Code checkpoints, a native VS Code extension, API memory/context tools) and an Agent SDK that exposes the same scaffolding Anthropic uses internally. Pricing remains unchanged from Sonnet 4 ($3 input / $15 output per million tokens).

What’s actually new?

  • SWE-bench Verified record. Anthropic reports 77.2% accuracy on the 500-problem SWE-bench Verified dataset using a simple two-tool scaffold (bash + file edit), averaged over 10 runs, no test-time compute, 200K “thinking” budget. A 1M-context setting reaches 78.2%, and a higher-compute setting with parallel sampling and rejection raises this to 82.0%.
  • Computer-use SOTA. On OSWorld-Verified, Sonnet 4.5 leads at 61.4%, up from Sonnet 4’s 42.2%, reflecting stronger tool control and UI manipulation for browser/desktop tasks.
  • Long-horizon autonomy. The team observed >30 hours of uninterrupted focus on multi-step coding tasks — a practical jump over earlier limits and directly relevant to agent reliability.
  • Reasoning/math. The release notes “substantial gains” across common reasoning and math evals; exact per-bench numbers (e.g., AIME config). Safety posture is ASL-3 with strengthened defenses against prompt-injection.
https://www.anthropic.com/news/claude-sonnet-4-5

What’s there for agents?

Sonnet 4.5 targets the brittle parts of real agents: extended planning, memory, and reliable tool orchestration. Anthropic’s Claude Agent SDK exposes their production patterns (memory management for long-running tasks, permissioning, sub-agent coordination) rather than just a bare LLM endpoint. That means teams can reproduce the same scaffolding used by Claude Code (now with checkpoints, a refreshed terminal, and VS Code integration) to keep multi-hour jobs coherent and reversible.

On measured tasks that simulate “using a computer,” the 19-point jump on OSWorld-Verified is notable; it tracks with the model’s ability to navigate, fill spreadsheets, and complete web flows in Anthropic’s browser demo. For enterprises experimenting with agentic RPA-style work, higher OSWorld scores usually correlate with lower intervention rates during execution.

Where you can run it?

  • Anthropic API & apps. Model ID claude-sonnet-4-5; price parity with Sonnet 4. File creation and code execution are now available directly in Claude apps for paid tiers.
  • AWS Bedrock. Available via Bedrock with integration paths to AgentCore; AWS highlights long-horizon agent sessions, memory/context features, and operational controls (observability, session isolation).
  • Google Cloud Vertex AI. GA on Vertex AI with support for multi-agent orchestration via ADK/Agent Engine, provisioned throughput, 1M-token analysis jobs, and prompt caching.
  • GitHub Copilot. Public preview rollout across Copilot Chat (VS Code, web, mobile) and Copilot CLI; organizations can enable via policy, and BYO key is supported in VS Code.

Summary

With a documented 77.2% SWE-bench Verified score under transparent constraints, a 61.4% OSWorld-Verified computer-use lead, and practical updates (checkpoints, SDK, Copilot/Bedrock/Vertex availability), Claude Sonnet 4.5 is developed for long-running, tool-heavy agent workloads rather than short demo prompts. Independent replication will determine how durable the “best for coding” claim is, but the design targets (autonomy, scaffolding, and computer control) are aligned with real production pain points today.


Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.

🔥[Recommended Read] NVIDIA AI Open-Sources ViPE (Video Pose Engine): A Powerful and Versatile 3D Video Annotation Tool for Spatial AI





Source link

Latest Posts

Don't Miss

Stay in touch

To be updated with all the latest news, offers and special announcements.