Anthropic Releases Claude 4 Models That Can Autonomously Work For Nearly a Full Corporate Workday (anthropic.com) 26

Posted by msmash on Thursday May 22, 2025 @01:20PM from the moving-forward dept.

Anthropic launched Claude Opus 4 and Claude Sonnet 4 today, positioning Opus 4 as the world's leading coding model with scores of 72.5% on SWE-bench and 43.2% on Terminal-bench. Both models feature a hybrid architecture that supports near-instant responses as well as an extended thinking mode for complex reasoning tasks.

The models introduce parallel tool execution and memory capabilities that allow Claude to extract and save key facts when given local file access. Claude Code, previously in research preview, is now generally available with new VS Code and JetBrains integrations that display edits directly in developers' files. GitHub integration enables Claude to respond to pull request feedback and fix CI errors through a new beta SDK.

Pricing remains consistent with previous generations at $15/$75 per million tokens for Opus 4 and $3/$15 for Sonnet 4. Both models are available through Claude's web interface, the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. Extended thinking capabilities are included in Pro, Max, Team, and Enterprise plans, with Sonnet 4 also available to free users.
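As a rough illustration of what those rates mean in practice, here is a minimal sketch of the arithmetic. It assumes the first figure in each pair is the price per million input tokens and the second the price per million output tokens; the summary does not spell this out. The dictionary keys and the `estimate_cost` helper are hypothetical names made up for this example, not API model identifiers.

```python
# Prices in USD per million tokens, as quoted in the summary.
# Assumption: each pair is (input price, output price).
# The keys are illustrative labels, not real API model names.
PRICES_PER_MILLION = {
    "opus-4": (15.00, 75.00),
    "sonnet-4": (3.00, 15.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    # Each side is billed in proportion to its token count.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: 200k tokens in, 50k tokens out, on each model.
    for name in PRICES_PER_MILLION:
        print(name, estimate_cost(name, 200_000, 50_000))
```

Under that reading, the same workload costs five times as much on Opus 4 as on Sonnet 4, since both the input and output rates differ by a factor of five.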

The startup, which counts Amazon and Google among its investors, said Claude Opus 4 could autonomously work for nearly a full corporate workday -- seven hours. CNBC adds: "I do a lot of writing with Claude, and I think prior to Opus 4 and Sonnet 4, I was mostly using the models as a thinking partner, but still doing most of the writing myself," Mike Krieger, Anthropic's chief product officer, said in an interview. "And they've crossed this threshold where now most of my writing is actually ... Opus mostly, and it now is unrecognizable from my writing."

Krieger added, "I love that we're kind of pushing the frontier on two sides. Like one is the coding piece and agentic behavior overall, and that's powering a lot of these coding startups. ... But then also, we're pushing the frontier on how these models can actually learn from and then be a really useful writing partner, too."

- Re: huh? (Score:3)
  
  by fluffernutter ( 1411889 ) writes:
  
  It can work with other chatbots of course! They'll be able to multiply the amount of power they use doing their useful work!!
  - Re: (Score:2)
    
    by Big Hairy Gorilla ( 9839972 ) writes:
    
    and they can have meetings, and they can schedule them all electronically.
    I wonder if they will have marathon meetings and then begin to fall asleep?
    or maybe search the web while the other AI drones on and on...
    This is all starting to sound pretty familiar :-)
    - Re: huh? (Score:2)
      
      by fluffernutter ( 1411889 ) writes:
      
      This is sounding like a Pixar film!

I want to know (Score:5, Funny)

by newslash.formatblows ( 2011678 ) writes: on Thursday May 22, 2025 @01:53PM (#65396299)

what happens after 7 hours? Claude needs lunch? A hug? It goes batshit and deletes the whole repository?

- Re: (Score:3)
  
  by Dan East ( 318230 ) writes:
  
  It means the other 17 hours it produces unusable nonsense that superficially looks correct that a human then has to spend 40 hours sorting out and fixing.
  - Re: (Score:2)
    
    by PPH ( 736903 ) writes:
    
    7 hours? Summary said "corporate work day". So, start at 10:00AM, break at noon for a three martini lunch and then hit the golf course.

Doing what? (Score:3)

by gweihir ( 88907 ) writes: on Thursday May 22, 2025 @02:43PM (#65396463)

Because producing hallucinations for 7 hours/day is pretty easy...

- Re: (Score:2, Insightful)
  
  by Moridineas ( 213502 ) writes:
  
  Given how much of your time you seem to spend here posting about LLMs and the repetitive nature of your posts, I'm starting to think you're one of them!
    - Re:Doing what? (Score:5, Funny)
      
      by gweihir ( 88907 ) writes: on Thursday May 22, 2025 @04:23PM (#65396867)
      
      I am not a developer. This is just something I can _also_ do.
      
    - Re: (Score:2)
      
      by null etc. ( 524767 ) writes:
      
      He's still gonna be ranting even after LLM-powered robots capture and imprison him.
      "What, you think you can control me?!?! You're just a glorified auto-complete!"
  - Re:Doing what? (Score:4, Insightful)
    
    by Plugh ( 27537 ) writes: on Thursday May 22, 2025 @05:06PM (#65396979) Homepage
    
    It is the new antivaxx. People who know a little more than the average non-techie person based ofc on mostly secondhand sources amp up the scare factor and get positive feedback in terms of clicks & attention and -- if they graduate to grifting -- ads. Yeah it is all a con stochastic parrots biggest bubble since tulips scorch the planet yadda. Meanwhile materials scientists using it to make better solar panels, plasma physicists using it to enable fusion, medical science unlocking the proteome for personal designer therapies.
    People should spend their energy learning how the tools work, and how to use them well.
    
The upcoming arms race is obvious. (Score:5, Insightful)

by Tschaine ( 10502969 ) writes: on Thursday May 22, 2025 @02:56PM (#65396511)

When most employees are producing multiple times the written output that they could produce on their own, everyone will need AI agents to summarize all of the documents, email, and slack/teams messages that are coming at them.
I'm not at all convinced that this will be better than communicating without the AI-powered inflation and summarization in between the humans.
In fact, this seems much more likely to introduce errors (and lose nuances) than plain old person to person communication.

- Re: (Score:1)
  
  by blue trane ( 110704 ) writes:
  
  What if one of the persons is weird, how can you communicate with that in the workplace? What if one of the persons is flirting?
- Re: (Score:2)
  
  by martin-boundary ( 547041 ) writes:
  
  Is it really "working" all day when you don't do what you're being told to do? Enquiring minds want to know!
- Re: (Score:2)
  
  by ScienceBard ( 4995157 ) writes:
  
  When most employees are producing multiple times the written output that they could produce on their own, everyone will need AI agents to summarize all of the documents, email, and slack/teams messages that are coming at them.
  I'm not at all convinced that this will be better than communicating without the AI-powered inflation and summarization in between the humans.
  In fact, this seems much more likely to introduce errors (and lose nuances) than plain old person to person communication.
  There's actually one step further than this that I enjoy thinking about. All these big tech firms had erected moats of technical complexity around themselves, it was part of the logic behind paying for developers you didn't need just to keep them away from "the other guy." There's two possibilities out of this AI hype: either it's not real, and they're squandering huge amounts of resources, or it is real and they're now entirely exposed. If, for a few hundred dollars, I can hand requirements to an LLM in pla

I believe it (Score:2)

by backslashdot ( 95548 ) writes:

I've seen vibe coding tools spend hours attempting to fix a bug they generated themselves, until the credits run out entirely.
- Re: (Score:2, Insightful)
  
  by dinfinity ( 2300094 ) writes:
  
  I've seen humans taking days to do that. Those hours cost a lot more than the AI model credits too.

7 hours!?! (Score:1)

by DBCubix ( 1027232 ) writes:

I barely got 30 minutes out of it before the usage limit was reached. I still have another 15 minutes to wait until the limits reset.

Claude.. (Score:3)

by Z80a ( 971949 ) writes: on Thursday May 22, 2025 @06:26PM (#65397227)

Here are the documentation and development tools for the Sega Saturn.
Please deliver the closest you can to the game Crysis using these tools; I expect the end result to be an ISO file that can be run on the real hardware.
Also, I expect all the CPUs to be in full use in ways beneficial to the overall graphical, audio, and gameplay quality.

- Re: (Score:2)
  
  by chas.williams ( 6256556 ) writes:
  
  How about a nice linked list instead?

after using for 10 minutes, (Score:1)

by iplayfast ( 166447 ) writes:

It says my time is up, will restart in 6 hours. Work for nearly a full work day? I think not!
