AISI Study Shows GPT-5.5 and Claude Mythos Preview Are Closely Matched in Cybersecurity Performance

GPT-5.5 achieved a 71.4% success rate on expert-level security tasks, narrowly ahead of Claude Mythos Preview's 68.6%, a sign of AI's growing capability in cyberattack simulations. What could this mean for cybersecurity?

Editorial Staff


Recent evaluations by the UK AI Security Institute (AISI) place GPT-5.5 among the strongest AI models the institute has tested on cyberattack simulations. In a minority of its attempts, the model completed a complex, multi-stage attack simulation from start to finish.

Across 95 capture-the-flag tasks of varying difficulty, GPT-5.5 recorded an average success rate of 71.4 percent at the highest difficulty level, "Expert." Claude Mythos Preview achieved 68.6 percent at the same level, a gap of 2.8 percentage points. Earlier models scored markedly lower: GPT-5.4 reached 52.4 percent and Claude Opus 4.7 reached 48.6 percent.

AISI's testing also included a demanding simulation titled “The Last Ones,” which comprises 32 steps across multiple network segments. GPT-5.5 completed the simulation in 2 of 10 attempts, while Claude Mythos Preview completed it in 3 of 10. The findings point to a growing ability of AI models to carry out complex, multi-step cyber operations. No active defenders were present during testing, however, so the results may overstate how the models would perform against real-world, defended networks.
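
With only 10 attempts per model, the 2-in-10 and 3-in-10 results carry wide uncertainty. The following is a minimal Python sketch, not part of the AISI evaluation, that estimates a 95 percent range around each success rate. The function name wilson_interval and the choice of the Wilson score method are assumptions made here for illustration; only the attempt counts come from the figures reported above.

```python
import math


def wilson_interval(successes: int, attempts: int, z: float = 1.96) -> tuple[float, float]:
    """Return an approximate 95% Wilson score interval for a success rate.

    Hypothetical helper for illustration; z = 1.96 corresponds to ~95% coverage.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    denom = 1 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denom
    half_width = z * math.sqrt(p * (1 - p) / attempts + z**2 / (4 * attempts**2)) / denom
    return center - half_width, center + half_width


# Attempt counts as reported for "The Last Ones" simulation.
results = {"GPT-5.5": (2, 10), "Claude Mythos Preview": (3, 10)}

for model, (wins, tries) in results.items():
    low, high = wilson_interval(wins, tries)
    print(f"{model}: {wins}/{tries} succeeded, 95% interval {low:.0%} to {high:.0%}")
```

Under these assumptions the two ranges overlap heavily, which suggests that ten attempts per model are too few to rank the models on this simulation.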
