On June 10, Anthropic unveiled two high-performance AI models, Claude 5 and Mythos 5, which are expected to disrupt the AI model industry. The introduction of these models intensifies pressure on OpenAI and Google regarding their upcoming model roadmaps.
Anthropic established a new performance tier called the "Mythos Class," adding it above the existing top-tier model, Opus. Both Claude 5 and Mythos 5 share the same foundational model but differ in access controls. Anthropic determined that the capabilities of these models could pose potential risks in cybersecurity and biology if released publicly, leading to a dual-release strategy.
Claude 5 features a safety classifier that routes cybersecurity and biology-related queries through Opus 4.8, making it available for general use. In contrast, Mythos 5, which has this safeguard disabled, is restricted to partners in cybersecurity defense and infrastructure, marking the first commercial service phase for the previously surprising Mythos model. Both models support a context window of 1 million tokens and a maximum output of 128,000 tokens, with pricing set at $10 per million tokens for input and $50 per million tokens for output, double the rates of Opus 4.8.
Benchmark performance significantly surpasses competing models. In the SWE-Bench Pro, a key indicator in software engineering, Claude 5 scored 80.3%, well above Opus 4.8's 69.2%, GPT-5.5's 58.6%, and Gemini 3.1 Pro's 54.2%. In the more challenging Frontier Code Diamond set, it achieved 29.3%, more than double Opus 4.8's 13.4%. In the GDPval-AA knowledge work assessment, it ranked first with an Elo score of 1932. Anthropic emphasizes that "the longer and more complex the task, the greater the gap compared to competing models."
Real-world performance is also noteworthy. In initial tests by Stripe, Claude 5 completed a migration of a codebase with 50 million lines in just one day. External researcher Matthew Pines reported that in his frontier physics research project, Claude 5 matched a task that took GPT-5.5 four days in just 36 hours.
The launch of these models has made the pressure on OpenAI and Google more apparent. On April 23, OpenAI released the previously anticipated GPT-6, codenamed "Spurd," as GPT-5.5, effectively delaying the GPT-6 schedule and opting for a mid-version response to the market. However, GPT-5.5's SWE-Bench Pro performance lags 21.7 percentage points behind Claude 5. As of now, no official release date for GPT-6 has been announced.
Google is in a similar position. At the Google I/O event on May 19, it showcased Gemini 3.5 Flash and is currently rolling out Gemini Omni, but industry experts assess that Gemini Omni's performance is on par with GPT-5.5 and does not reach the levels of the Mythos Class.
The competition for IPO funding is also a factor. With both OpenAI and Anthropic pursuing public listings, the launch of Anthropic's high-performance models is likely to attract investor interest.
Industry analysts suggest that the arrival of Claude 5 and Mythos 5 goes beyond mere model performance updates, resetting the benchmarks within the AI sector. Anthropic's strategy of creating a new tier above Opus is forcing competitors to thoroughly reassess their roadmaps.
* This article has been translated by AI.
Copyright ⓒ Aju Press All rights reserved.