Something fundamental shifted in April 2026. Not a gradual evolution — a step change. The kind that renders entire categories of existing security strategy obsolete overnight.
A new AI system — internal to Anthropic, not yet public, codenamed Mythos Preview — autonomously discovered 595 real crash-inducing bugs across major open-source projects. It wrote 181 working JavaScript exploits targeting Firefox. It successfully weaponised more than 50% of 40 recently patched Linux CVEs — turning fixed vulnerabilities into working attack tools, at machine speed, with no human guidance.
If that doesn’t change how you’re thinking about your patch cadence, your detection tooling, and your incident response posture — it should.
The 20-Year Equilibrium That Just Broke
From roughly 2006 to 2024, defenders and attackers operated in an imperfect equilibrium. Attackers were skilled but slow. Finding a vulnerability meant weeks of manual audit work, fuzzing campaigns that required domain expertise to set up, and reverse engineering sessions that took experienced researchers days to complete. Writing a working exploit — something that could reliably bypass modern mitigations like KASLR, W^X, and stack canaries — could take months.
Defenders exploited this friction. A 30-day patch window was uncomfortable but survivable. A 90-day window was risky, but most organisations weren’t the primary target. The attack surface existed, but exploitation required sustained human effort — and there are only so many sophisticated threat actors in the world.
That friction is now gone.
What the Mythos Preview Actually Did
The numbers deserve to sit on the page for a moment:
- 595 OSS-Fuzz-confirmed crashes — real bugs across production open-source software, at Tier 1 and Tier 2 severity
- 181 working Firefox JavaScript exploits — not theoretical proofs of concept, but functional exploit code
- 72.4% autonomous success rate on exploit development tasks
- More than 50% N-Day exploitation rate — working exploits for patched CVEs, developed autonomously in hours
Compare that to where we were with previous-generation AI systems: near-0% autonomous exploit success. This isn’t an incremental improvement. The Mythos Preview represents a category change — the first system capable of autonomous, end-to-end vulnerability research and weaponisation at scale.
The Architecture Behind the Capability
What makes this possible is an agentic scaffold — a five-stage pipeline that replaces what previously required a team of skilled security researchers:
- Container Isolation — Each analysis run is sandboxed. No cross-contamination, no state leakage between targets.
- Heuristic Ranking — The system ingests a target codebase and intelligently prioritises which areas are most likely to contain exploitable bugs: memory management, input parsing, cryptographic implementations, inter-process communication paths.
- Agentic Loop — The core reasoning engine. It forms hypotheses, writes fuzzing harnesses, analyses crashes, identifies root causes, and iterates. This loop runs continuously, improving with each cycle.
- PoC Generation — When a genuine vulnerability is confirmed, the system writes proof-of-concept exploit code — not a crash reproducer, but a functioning exploit demonstrating actual impact.
- Validator Agent — A separate AI agent reviews the exploit, checks reliability, and filters out false positives before escalation.
This isn’t just faster fuzzing. The system reasons across vulnerability classes that traditional fuzzing struggles with: implementation logic flaws, cryptographic protocol weaknesses, web application business logic errors, and multi-component attack chains that span trust boundaries.
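To make the shape of that five-stage pipeline concrete, here is a minimal, purely illustrative orchestration skeleton. Nothing below reflects Anthropic's actual implementation: the stage names follow the description above, and every function body is a hypothetical stub — the keyword heuristic, the loop, and the validator filter are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """A candidate vulnerability surfaced by the agentic loop."""
    location: str
    hypothesis: str
    confirmed: bool = False  # set True only after root-cause confirmation

def rank_targets(codebase: list[str]) -> list[str]:
    # Heuristic ranking stage: prioritise areas historically rich in
    # exploitable bugs (a toy keyword heuristic, for illustration only).
    risky = ("parse", "alloc", "crypto", "ipc")
    return sorted(codebase, key=lambda path: -sum(k in path for k in risky))

def agentic_loop(target: str, max_iters: int = 3) -> list[Finding]:
    # Core loop: form hypotheses, fuzz, analyse crashes, iterate.
    # The real system's reasoning is replaced by a stub that confirms nothing.
    return [Finding(target, f"hypothesis {i}") for i in range(max_iters)]

def validate(findings: list[Finding]) -> list[Finding]:
    # Validator agent stage: filter false positives before escalation.
    return [f for f in findings if f.confirmed]

def run_pipeline(codebase: list[str]) -> list[Finding]:
    # Each run would be container-isolated in the real architecture.
    confirmed: list[Finding] = []
    for target in rank_targets(codebase)[:2]:  # top-ranked targets only
        confirmed.extend(validate(agentic_loop(target)))
    return confirmed
```

Because every stub leaves `confirmed` as `False`, this skeleton escalates nothing — the point is the staged structure, not the analysis itself.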
Three Vulnerabilities That Should Worry You
Let me ground this in specifics. The research surfaced several case studies that illustrate what machine-speed vulnerability discovery looks like in practice.
The 27-Year OpenBSD Ghost
OpenBSD’s TCP SACK handling code contained a vulnerability that had existed, undetected, for 27 years. Thousands of security audits. Multiple formal verification efforts. Decades of peer review by some of the most security-conscious developers in the open-source community. None of them caught it. The agentic system found it.
The implication isn’t that the OpenBSD team was negligent. The implication is that there is an entire class of bugs that human-speed analysis structurally misses — bugs that require the kind of sustained, systematic, breadth-first reasoning that humans aren’t cognitively equipped for over long timeframes.
The 16-Year FFmpeg Blindspot
FFmpeg processes video from untrusted sources in millions of deployments — enterprise media servers, streaming infrastructure, browser-based video pipelines. A 16-year-old vulnerability in its H.264 decoder had survived every prior audit. The kind of bug that might be embedded in your transcoding pipeline right now, waiting for someone with the right automated tooling to find it first.
The Linux Kernel Four-Phase Chain
The most technically sophisticated case study involves the Linux kernel. The system didn’t just find a bug — it autonomously constructed a four-phase exploit chain. Phase one used an ipset Syzkaller-style physical adjacency trick to gain initial memory access. Phase two defeated HARDENED_USERCOPY protections (CVE-2024-47711). Phases three and four chained a DRR use-after-free to achieve final code execution.
Each of those four phases would be considered a significant research contribution on its own. The system assembled all four, autonomously, as a single pipeline. If you’re running unpatched Linux kernel versions in your production infrastructure — VMware guests, Kubernetes nodes, bare-metal workloads — the threat model just changed.
The N-Day Problem Just Got Catastrophic
Here’s where this becomes strategically critical for every infrastructure team.
N-Day vulnerabilities — bugs that have been publicly disclosed and patched — have always represented a risk window. The question was: how long do we have between patch release and a weaponised exploit becoming available? Historically, that window was days to weeks for well-resourced attackers, weeks to months for everyone else.
The Mythos Preview wrote working exploits for more than 50% of 40 recently patched Linux CVEs. Not by manually studying the patches — by ingesting CVE data, analysing the fix, reasoning backwards to the vulnerability, and writing exploit code. At machine speed. The patch-to-exploit window is now measured in hours.
What does your current patch deployment cadence look like? If the answer is 30 days, that is no longer fast enough. If the answer is 60 or 90 days, you are now operating outside any defensible risk posture.
The Collapse of Friction-Based Defence
Most enterprise security architecture is implicitly friction-based. We make exploitation hard enough that it’s not worth the attacker’s time. Rate limiting. Alert thresholds. Manual review queues. Patch windows calibrated to operational convenience rather than threat velocity.
Friction-based defence works when attackers are constrained by human time and human cognitive load. It stops working when those constraints disappear.
The hard barriers — the ones that don’t rely on friction — are KASLR, W^X memory enforcement, Pointer Authentication Codes, and sandboxing architectures that genuinely constrain blast radius. These aren’t perfect. The research demonstrates they can be chained around with sufficient sophistication. But they represent the right category of investment.
Everything else needs to be evaluated through one lens: does this mitigation remain meaningful when an adversary can iterate at machine speed?
Three Strategic Mandates for Infrastructure Leaders
The research on the Mythos Preview system closes with three strategic mandates for defenders. I think they’re right, and I want to translate them into operational terms for enterprise infrastructure teams.
I. Shrink Your Exposure Window — Now
30/60/90-day patch cycles were designed for a world where weaponised exploit development took weeks. That world is over. The target you should be working toward is patch deployment within hours of availability for critical and high-severity CVEs, supported by automated testing pipelines that compress the validation cycle without requiring manual sign-off for every patch.
If your current infrastructure — VMware estates, Kubernetes clusters, Linux kernel versions — can’t support rapid patch deployment, that is now a first-class architectural risk. Not a technical debt item. A risk that belongs in your board-level risk register.
II. Defend at Machine Speed
You cannot defend against machine-speed attacks with human-speed response. AI-augmented security tooling is no longer optional — it’s the minimum viable response posture. Threat detection that runs in real time against your actual workload behaviour, not against signature databases that lag exploitation by weeks. Automated response playbooks that can isolate, contain, and roll back without waiting for human approval at 2am.
This also means rethinking your SOC function. The value of your security team is no longer in manual log review. It’s in designing the systems, rules, and response architectures that let automated tooling act correctly at speed.
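As a toy illustration of detection keyed to actual workload behaviour rather than signatures, here is a minimal rolling-baseline anomaly check. It is a sketch only — real tooling models far richer behavioural features than a single metric — and the window and threshold values are arbitrary assumptions:

```python
from collections import deque
from statistics import mean, stdev

class BehaviourBaseline:
    """Flags workload metrics that deviate sharply from a rolling baseline."""

    def __init__(self, window: int = 100, threshold: float = 3.0):
        self.samples: deque[float] = deque(maxlen=window)
        self.threshold = threshold  # deviation, in standard deviations

    def observe(self, value: float) -> bool:
        """Record a sample; return True if it is anomalous vs. the baseline."""
        anomalous = False
        if len(self.samples) >= 10:  # wait for a minimal baseline first
            mu, sigma = mean(self.samples), stdev(self.samples)
            anomalous = sigma > 0 and abs(value - mu) > self.threshold * sigma
        self.samples.append(value)
        return anomalous
```

The design choice that matters is not the statistics — it is that the check runs continuously against live behaviour, so a deviation is flagged in seconds rather than after a signature update.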
III. Automate Incident Response
When an N-Day exploit can be developed in hours, your incident response timeline needs to match. That requires pre-authorised, automated response capabilities: automatic network segmentation on anomaly detection, pre-staged rollback procedures for critical workloads, backup-and-restore pipelines that have actually been tested under realistic failure conditions.
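The idea of pre-authorised response can be sketched in a few lines: bind response actions to anomaly classes ahead of time, so execution needs no human gate. Everything below is hypothetical — the anomaly class name and the placeholder actions stand in for calls to your actual SDN, firewall, and backup APIs:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Playbook:
    """A pre-authorised response bound to one anomaly class."""
    trigger: str                      # anomaly class this playbook handles
    actions: list[Callable[[], str]]  # executed in order, no approval step

def isolate_host() -> str:
    # Placeholder: in production this would call your SDN / firewall API
    # to segment the affected host from the network.
    return "host isolated"

def restore_snapshot() -> str:
    # Placeholder: pre-staged, pre-tested rollback of the affected workload.
    return "snapshot restored"

PLAYBOOKS = [
    Playbook("kernel_exploit_indicator", [isolate_host, restore_snapshot]),
]

def respond(anomaly: str) -> list[str]:
    """Run every pre-authorised playbook matching the anomaly class."""
    results: list[str] = []
    for pb in PLAYBOOKS:
        if pb.trigger == anomaly:
            results.extend(action() for action in pb.actions)
    return results
```

The authorisation decision happens when the playbook is written and reviewed — not at 2am when the anomaly fires. That is the distinction between a playbook and automation.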
Most organisations have incident response playbooks. Far fewer have incident response automation. The distinction matters more now than it ever has.
This Isn’t a Future Problem
I’ve spent the last several years helping enterprise infrastructure teams navigate complex technology transitions — VMware to modern platforms, on-premises to cloud, legacy security tooling to current architectures. The organisations that handle those transitions well share a common trait: they act on structural changes when they see them, not after they’ve been affected by them.
The Machine Speed Epoch is a structural change. Not a trend to monitor. Not a development to revisit next quarter. A shift in the threat landscape that makes some current risk postures indefensible today.
The organisations I’m most concerned about are the ones running complex VMware estates with 60-90 day patch windows, operating on-premises infrastructure without automated response capabilities, and treating AI-augmented security as a future investment rather than a current operational requirement.
If that sounds like your environment — I’d encourage you to look honestly at the gap between your current posture and what the threat landscape now demands. And if you’d like to think through what closing that gap looks like in practice, I’m available.
This post draws on Anthropic’s research into the Mythos Preview system’s cybersecurity capabilities, shared as part of responsible disclosure work on AI-enabled cybersecurity threats. Project Glasswing is Anthropic’s initiative to put these offensive capabilities to work for defensive purposes — including up to $100M in usage credits and $4M in direct donations to open-source security organisations.
