1 May 2026

Mythos, Models, and the Myth of the Tipping Point

AI is compressing the time between vulnerability discovery and exploitation. Here's what Claude Mythos, Copy Fail, and recent npm attacks mean for defenders.

Claude Mythos matters. But is this the transformational change many are claiming, or an acceleration of trends already under way?

Our view: it is more the latter.

Beyond the marketing hype, this does look like a capability jump. The UK AI Security Institute (AISI) evaluated Claude Mythos Preview and concluded that it represented a step up over previous frontier models, including becoming the first model they tested to complete a corporate network attack simulation end-to-end [1]. But the important point is that this was not a one-off. AISI's later evaluation of OpenAI's GPT-5.5 found that it reached a similar level of performance, becoming the second model to complete one of their multi-step cyber-attack simulations end-to-end [2]. AISI's own conclusion was that this points to a broader trend, not just a single exceptional model.

Copy Fail makes that trend feel much less theoretical. CVE-2026-31431 is a local privilege escalation in the Linux kernel that has affected mainstream distributions since 2017 [3]. The public write-up says it was surfaced by Xint Code after around an hour of AI-assisted scan time against the Linux crypto/ subsystem, with one operator prompt and no harnessing [4]. The result was a reliable privilege-escalation primitive: an unprivileged local user could gain root on affected systems.

That is the shift. Attackers still do the same three things: find vulnerabilities, build exploits, and use them. What changes is the cost, skill threshold, and time between each step. If models can help compress vulnerability discovery from months of expert research into hours of assisted scanning, the defender's problem changes. Patch cycles, vulnerability management, and exposure windows were built for a world where high-grade exploit discovery was scarce. That assumption is starting to weaken.

Mythos is not the moment cyber suddenly changed. The change has been happening for years, and the pace has been increasing. Copy Fail shows what that looks like in the real world: not a new class of attacker behaviour, but a faster route through the same attack chain. Tooling has been compressing those timelines for years: Nmap, Metasploit, Armitage and Cobalt Strike each made a specialist task easier to repeat and scale. AI does the same thing. It isn't new. It's just faster.

One data point: Rapid7 reports that the median time from disclosure to a vulnerability appearing in CISA's known-exploited list has dropped from 8.5 days to 5, and mean time-to-exploit has fallen from 61 days to 28.5 [5]. Mythos didn't start that trend, but it could accelerate it.

So what does AI change in practice?

At Claranet, we've been using AI in penetration testing and Continuous Security Testing for a while. It helps us move quicker through the mechanical parts of the work. The judgement still sits with the tester.

These tools are good at scanning code and surfacing potential issues. The hard part is deciding what matters for your environment. Anthropic says that over 99% of the vulnerabilities Mythos has found are still unpatched while disclosure is coordinated [6]. That's a lot of output to sort through.

We should be careful not to overstate this. Some of the reporting is still thin, and some of the claims will need to stand up over time. However, we know from years of testing that finding a vulnerability is only part of the process: validating whether it's really an issue comes next, and prioritising the list effectively is essential if action is to be targeted to best effect.

If you run security in a real organisation, this probably feels familiar already. Even after verification and prioritisation, the issue usually isn't whether a patch exists. It's whether you can safely find it, test it, approve it, and deploy it without breaking production.

We don't yet know how many of the high and critical findings Mythos reports are truly exploitable in real deployments, versus sitting in dead code paths. Context still matters, and most code analysis struggles to capture it.

The OpenBSD finding

Mythos found a 27-year-old bug in OpenBSD that crashes any host responding over TCP [7]. Everyone focused on the AI angle. Nobody asked why the bug was still there.

The economics increasingly shape who takes part in vulnerability discovery. As the OpenBSD finding shows, large-scale AI-driven campaigns are not free: Anthropic's full campaign cost about $20,000, though the specific OpenBSD bug was uncovered for less than $50 [7]. That cost dynamic shifts the incentives. Defenders with commercial products to protect, state actors pursuing strategic objectives, and criminal groups seeking profitable exploits are the ones most likely to invest at scale, because the return on investment is clear: securing assets, advancing national interests, or monetising vulnerabilities. Lone researchers and hobbyists rarely have the backing to run such extensive scans, especially against less lucrative targets like OpenBSD, where there's no bug bounty scheme. The market is being shaped by those who can justify the spend, not by those motivated purely by curiosity or community contribution.

The bar has been rising at pace

No question, Mythos is a step up. On Mozilla's Firefox 147 JavaScript engine, Mythos produced 181 working exploits across several hundred attempts, compared to just 2 from Claude Opus 4.6 on the same benchmark [6].

However, the bar has been rising for years, and patching in most organisations is still slow. Edgescan puts mean time to remediate critical severity vulnerabilities at around 60–65 days, and that figure has barely shifted [8]. Exploitation timelines are now measured in days. That's where most organisations are still struggling.

But patching might be the problem

On 8 September 2025, attackers phished an npm maintainer account and pushed malicious versions of 18 widely used JavaScript packages including chalk and debug [9]. Those packages see over 2.6 billion downloads per week. The compromised versions were live for about two hours [10]. If your pipeline was set to pull the latest version automatically, you were exposed before anyone knew there was an issue.

On 31 March 2026, two malicious versions of Axios were published to npm. Axios has over 70 million weekly downloads. The attack was attributed to a North Korean state actor [11]. Because Axios is commonly auto-updated, any project pulling the latest version automatically connected to attacker infrastructure and downloaded a remote access trojan. The attack didn't modify Axios itself; it added a new dependency with the payload buried inside it.

Patch slowly and you're exposed to the vulnerability. Patch automatically and you might pull in the compromise.

There isn't an easy answer here.
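There's no easy answer, but knowing where you're exposed to floating versions is a start. As a minimal sketch (the manifest contents are invented, and the pattern is a rough heuristic rather than a full semver parser), a few lines of Python can flag npm-style dependency specifiers that allow automatic upgrades rather than an exact pin:

```python
import json
import re

# Rough heuristic for npm-style specifiers that can silently pull a
# newer release: caret/tilde ranges, wildcards, x-ranges, comparators.
FLOATING = re.compile(r"^[\^~]|^\*$|^latest$|[<>x]", re.IGNORECASE)

def floating_dependencies(manifest: dict) -> list[str]:
    """Return 'name@spec' for every dependency whose version range
    allows an automatic upgrade rather than an exact pin."""
    flagged = []
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if FLOATING.search(spec):
                flagged.append(f"{name}@{spec}")
    return flagged

if __name__ == "__main__":
    manifest = json.loads("""{
        "dependencies": {"axios": "^1.6.0", "chalk": "5.3.0"},
        "devDependencies": {"debug": "~4.3.4"}
    }""")
    # axios and debug float; chalk is pinned exactly
    print(floating_dependencies(manifest))
```

A report like this doesn't tell you what to pin, only where a silent upgrade is currently possible — which is exactly the exposure window the npm incidents above exploited.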

So, what should IT and security leaders do?

Reiterate the basics. Know your attack surface. You need to know not only what you have, but what technology it's running, who owns it, and what part it plays in your business. Once you're clear on the scope, turn up the rate of vulnerability scanning. This is how you find the patches you've missed, the old router, the forgotten firewall, the system nobody owns any more.
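One way to make "who owns it" actionable is to reconcile scan output against the asset register and surface anything unowned. This is a toy sketch, with hostnames and ownership fields invented for illustration:

```python
def unowned_assets(scan_results: set[str], asset_register: dict[str, str]) -> set[str]:
    """Hosts that answered a scan but have no recorded owner.
    Catches both hosts missing from the register entirely and
    hosts present but with an empty owner field."""
    return {host for host in scan_results
            if not asset_register.get(host)}

if __name__ == "__main__":
    scanned = {"web-01", "db-02", "old-router", "forgotten-fw"}
    register = {"web-01": "platform team", "db-02": "data team", "old-router": ""}
    # 'old-router' has no owner; 'forgotten-fw' isn't in the register at all
    print(sorted(unowned_assets(scanned, register)))
```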

Get ruthless about triage and process. As tools like Mythos scale, you'll see more findings and more patches. Make it easy to ingest, prioritise and apply changes without turning every update into a fire drill. In practice, the hardest part is rarely finding the issue. It's deciding what matters, getting the right people to act, and doing it without causing a different outage.
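Triage logic like that can be made explicit, even crudely. The sketch below ranks findings by base severity adjusted for exposure; the multipliers are illustrative, not any standard scoring scheme:

```python
def priority(cvss: float, internet_facing: bool, business_critical: bool) -> float:
    """Crude triage score: base severity boosted by real-world exposure.
    The weights are illustrative placeholders, not a standard."""
    score = cvss
    if internet_facing:
        score *= 1.5
    if business_critical:
        score *= 1.25
    return round(score, 1)

if __name__ == "__main__":
    findings = [
        ("CVE-A", 9.8, False, False),  # critical, but internal, non-critical asset
        ("CVE-B", 7.5, True, True),    # high, internet-facing, business-critical
    ]
    ranked = sorted(findings, key=lambda f: priority(f[1], f[2], f[3]), reverse=True)
    # CVE-B outranks CVE-A once exposure is factored in
    print([name for name, *_ in ranked])
```

The point isn't the specific weights; it's that "what matters for your environment" becomes a repeatable decision rather than an argument in a ticket queue.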

Know your compliance floor. Cyber Essentials, PCI DSS, ISO 27001, NHS DSPT, PSN, CBEST: each has its own patching expectations. Map yours. Meet them. Then go beyond where the risk justifies it. Cyber Essentials already requires high and critical vulnerabilities to be patched within 14 days [12]. PCI DSS requirement 6.3.3 requires critical vulnerabilities to be patched within 30 days [13].

Be pragmatic about patching the supply chain. Where you can safely auto-patch, do. In many real environments you can't: legacy systems, regulated change control, uptime requirements, and software that breaks when anything moves. Where auto-patching isn't viable, document why and invest in the processes that shorten your manual window. Pin versions in production. Validate before shipping. Build a software bill of materials so you know what you're actually running. The discipline isn't "patch fast"; it's "patch fast with validation".
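A software bill of materials can start as something very simple: a flat list of exact name@version pairs pulled from your lockfile. The sketch below assumes a simplified npm-lock-style nested structure; the real format has more fields and variants:

```python
def inventory(lock: dict) -> set[str]:
    """Flatten an npm-lockfile-style dependency tree into a set of
    exact name@version pairs - a minimal bill of materials."""
    seen: set[str] = set()

    def walk(deps: dict) -> None:
        for name, info in deps.items():
            seen.add(f"{name}@{info['version']}")
            walk(info.get("dependencies", {}))  # recurse into nested deps

    walk(lock.get("dependencies", {}))
    return seen

if __name__ == "__main__":
    # Simplified, invented lockfile fragment for illustration
    lock = {"dependencies": {
        "axios": {"version": "1.6.0", "dependencies": {
            "follow-redirects": {"version": "1.15.2"}}},
        "chalk": {"version": "5.3.0"},
    }}
    print(sorted(inventory(lock)))
```

Even a flat set like this is enough to answer the question that mattered in both npm incidents above: "are we running the compromised version, anywhere in the tree?"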

The real story

The organisations that cope best with change in the cybersecurity landscape are the ones with the basics well set. They treat patching as an operational function, keep visibility into their dependency tree, and assume the software supply chain deserves as much scrutiny as the perimeter.

Mythos makes an already-fast problem faster. Most organisations are still running vulnerability management and patch governance on timelines that were "fine" a decade ago. They aren't fine now.

If you're responsible for vulnerability management, patch governance or security testing, the practical question is simple: do you have a foundation you can accelerate from, in order to face a problem that's only getting faster?

References

  1. UK AI Security Institute, "Our evaluation of Claude Mythos Preview's cyber capabilities." https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities
  2. UK AI Security Institute, "Our evaluation of OpenAI's GPT-5.5 cyber capabilities." https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities
  3. Tenable, "Copy Fail (CVE-2026-31431): Linux Kernel Privilege Escalation FAQ." https://www.tenable.com/blog/copy-fail-cve-2026-31431-frequently-asked-questions-about-linux-kernel-privilege-escalation
  4. Theori / Xint, "Copy Fail: 732 Bytes to Root on Every Major Linux Distribution." https://xint.io/blog/copy-fail-linux-distributions
  5. Rapid7, "2026 Global Threat Landscape Report." https://www.rapid7.com/research/report/global-threat-landscape-report-2026/
  6. VentureBeat, "Mythos autonomously exploited vulnerabilities that survived 27 years of human review." https://venturebeat.com/security/mythos-detection-ceiling-security-teams-new-playbook
  7. AI2Work, "Anthropic's Claude Mythos Uncovers a 27-Year-Old OpenBSD Bug." https://ai2.work/blog/anthropic-s-claude-mythos-uncovers-a-27-year-old-openbsd-bug
  8. Edgescan, "2025 Vulnerability Statistics Report." https://www.edgescan.com/stats-report/
  9. Aikido Security, "npm debug and chalk packages compromised." https://www.aikido.dev/blog/npm-debug-and-chalk-packages-compromised
  10. Palo Alto Networks, "Breakdown: Widespread npm Supply Chain Attack Puts Billions of Weekly Downloads at Risk." https://www.paloaltonetworks.com/blog/cloud-security/npm-supply-chain-attack/
  11. Microsoft Security, "Mitigating the Axios npm supply chain compromise." https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/
  12. Claranet, "2026 changes to Cyber Essentials and Cyber Essentials Plus – what you need to know." https://www.claranet.com/uk/blog/2026-changes-to-cyber-essentials-and-cyber-essentials-plus-what-you-need-to-know/
  13. TrustedSec, "PCI DSS Vulnerability Management: The Most Misunderstood Requirement." https://trustedsec.com/blog/pci-dss-vulnerability-management-the-most-misunderstood-requirement-part-3