OpenAI Unveils GPT-5.5, Pushing Agent-Style AI and ‘Super App’ Strategy

OpenAI has unveiled its latest artificial intelligence model, GPT-5.5, sharpening competition in generative AI and signaling a shift beyond incremental performance gains toward agent-style AI and a broader “super app” strategy.

OpenAI said on April 23 (local time) that GPT-5.5 features markedly improved reasoning and autonomy. The company said the model’s capabilities in complex work such as coding, research and data analysis have advanced, with a key focus on “agent-style” task execution, in which the model interprets a user’s intent and maps out the steps to solve a problem.

OpenAI described GPT-5.5 as a core engine for its “AI super app” approach. The model can recognize what is on a screen and carry out computer actions such as clicking and typing, moving across tools in ways the company says bring it closer to collaborating with humans. OpenAI said this would push ChatGPT beyond a chat window into an integrated platform combining work, search and productivity tools.

As an example, the company said that if a user asks it to analyze recent market trends, write a report and send it by email, GPT-5.5 can open a browser to gather information, draft the document and operate an email client to complete the task. The company framed this as a move from providing answers to acting as an executor of work.

OpenAI also said it strengthened security guardrails to match the model’s capabilities, having judged GPT-5.5 to fall into a “high-risk” category because of potential misuse such as cyberattacks. It said the model underwent an unprecedented level of red-team testing, in which testers simulate attacks on the model.

The release comes as rival Anthropic has rolled out next-generation models including “Claude Mythos” and “Opus 4.7,” intensifying competition. According to a report released by OpenAI, GPT-5.5 outperformed Opus 4.7 on several key benchmarks.

On “GDPval,” a measure of practical task performance, GPT-5.5 scored 84.9%, about 4 percentage points higher than Opus 4.7. On “Terminal-Bench 2.0,” which evaluates system-control capability, it scored 82.7%, about 13 percentage points higher. On “CyberGym,” a security-related benchmark, it scored 81.8%, well above a competing model’s 73.1%.

However, on SWE-Bench Pro, a coding benchmark, GPT-5.5 scored 58.6%, trailing Opus 4.7’s 64.3%. OpenAI also raised concerns about possible data memorization by the competing model and said it disagreed with aspects of the evaluation approach.

OpenAI said GPT-5.5 will be integrated across its major services, including ChatGPT, accelerating its push toward an “AI super app” that handles a range of tasks on a single platform. As generative AI moves from a support tool to what the company described as a “digital colleague,” OpenAI said it expects broad changes in productivity across companies and industries.

At a news conference, OpenAI President Greg Brockman said, “The biggest feature of this model is that it can do more with less instruction.” He added, “Its ability to interpret incomplete or ambiguous problems on its own and decide the next steps has improved significantly.” Brockman said it was “an important advance that will form the foundation for how computers are used in the future and for large-scale agent-style computation.”

* This article has been translated by AI.