The Problem Anthropic Couldn't Ignore
When Anthropic tested Claude Mythos Preview internally, they ran into a problem that no AI company had faced quite like this before.
The model was extraordinarily good at finding vulnerabilities in software. Not just "better than previous Claude models" good — better than most human security researchers good. In controlled evaluations, Mythos identified thousands of zero-day vulnerabilities across every major operating system and browser, some of them undetected for over two decades.
That created an impossible release question: how do you make a model that powerful publicly available without handing nation-state-level offensive capability to anyone with an API key?
Anthropic's answer was Project Glasswing — and it's unlike anything a frontier AI lab has announced before.
What Project Glasswing Actually Is
Project Glasswing is a coalition-based initiative in which Anthropic is deploying Claude Mythos Preview defensively — to find and fix vulnerabilities in the world's most critical software — before that capability becomes available to attackers.
The logic is straightforward: if Mythos can find zero-days at scale, so can a malicious actor with access to the same model. The only way to responsibly release a model this capable is to ensure that the vulnerabilities it would be used to exploit have already been patched. Glasswing is the mechanism to do that.
Anthropic is providing access to over 50 organizations — with $100 million in usage credits — to run Mythos against their own systems and find vulnerabilities before the model becomes more broadly available.
Who's In the Coalition
The confirmed Project Glasswing partners represent the companies responsible for the largest share of the world's shared cyberattack surface:
Cloud and Infrastructure:
- Amazon Web Services
- Microsoft
Hardware and Platform:
- Apple
- Broadcom
- Nvidia
Security:
- CrowdStrike
- Cisco
Financial Services:
- JPMorganChase
Plus approximately 40 additional organizations responsible for maintaining critical software infrastructure — the open-source projects, OS components, and network libraries that underpin essentially all modern software.
The composition of this list is significant. Anthropic didn't just invite the biggest companies — they invited the companies whose software, if compromised, could affect billions of users. AWS infrastructure, Apple's iOS, Microsoft's Windows kernel, Google's Chrome — these are the attack surfaces that matter most.
What These Organizations Are Actually Doing
Project Glasswing partners receive access to Claude Mythos Preview to run against their own systems under a structured security research protocol.
The workflow, in simplified terms:
- The partner organization provides Mythos with their codebase, binary, or system access (in a sandboxed environment)
- Mythos performs autonomous vulnerability research — static analysis, dynamic testing, binary reverse engineering
- Discovered vulnerabilities are reported to the partner's security team
- The partner patches before coordinated disclosure
For closed-source software (which applies to Apple, Microsoft, and many others), Mythos's binary reverse engineering capability is what makes this viable. The model doesn't need source code. It can reconstruct meaningful program logic from stripped compiled binaries — a capability that previously required specialist expertise that didn't scale.
The Numbers Behind the Discovery
Anthropic's pre-announcement internal testing with Mythos produced numbers that the security community is still absorbing:
- Thousands of zero-day vulnerabilities identified across every major OS and browser
- Vulnerabilities ranging in age from recent to 27 years old (a bug in OpenBSD that had been present since 1999)
- In offensive benchmark testing: 595 exploitable crashes at the most common vulnerability tiers, and full control-flow hijack on 10 separate fully-patched targets — the highest severity tier
- SWE-bench Verified score of 93.9% (general software engineering proxy)
- CyberGym score of 83.1% (security-specific benchmark)
The 27-year-old OpenBSD bug is the detail that keeps security researchers up at night. A vulnerability that survived decades of expert human auditing, automated fuzzing, and static analysis tools — found by an AI model in what is presumably hours or days of compute time. That's not an incremental improvement. That's a regime change.
Why This Is Bigger Than a Patch Tuesday
Traditional vulnerability disclosure follows a well-understood cycle: researcher finds bug, contacts vendor, vendor patches, coordinated public disclosure. Project Glasswing is different in three important ways.
Scale. Human security researchers find vulnerabilities one at a time. Mythos finds them by the thousands, in parallel, across every major software platform simultaneously. The throughput is categorically different.
Coverage. Human researchers specialize. Mythos doesn't — it can context-switch between an iOS kernel, a Chrome JavaScript engine, and a Windows networking stack without losing capability. There are no gaps that fall between specializations.
Access to closed-source systems. The most consequential vulnerabilities are often in proprietary software that human external researchers can't audit. Mythos's binary analysis capability changes this — you no longer need source code access to find vulnerabilities at depth.
The implication is that the months following Project Glasswing will see an unprecedented wave of security patches across the industry. Many of the vulnerabilities being patched are bugs that have been present for years, sitting quietly while attackers who happen to find them have had free reign.
The Harder Question: What Happens After Glasswing?
Anthropic has been transparent that Glasswing is a precondition for broader release, not a permanent restriction. The initiative exists to reduce the offensive harm potential of a more widely available Mythos by ensuring its most exploitable findings are already patched.
But this raises a question the security community is actively debating: will it be enough?
The argument that it won't: Zero-days exist in a long tail. You can't patch everything before releasing a model that finds everything. Some vulnerabilities will be discovered by Mythos-level models at the same time defenders are trying to patch Glasswing discoveries. The race doesn't end.
The argument that it will: The most critical, highest-severity vulnerabilities — the ones that allow full system compromise — are disproportionately represented in major platforms (Windows, iOS, Chrome, Linux). Glasswing concentrates effort exactly there. Patching the worst 20% of vulnerabilities eliminates the majority of the real-world risk.
Both arguments are probably right. The window of safety isn't infinite — but it doesn't need to be infinite. It needs to be long enough for the most critical patches to deploy at scale.
What This Means for Startups and Tech Companies
If your product depends on any of the platforms in the Glasswing coalition — and it almost certainly does — you will see an unusual number of security patches from those platforms in the coming months. Patch aggressively. The fixes coming out of Glasswing are patching vulnerabilities that have been present, potentially exploited, and hidden for years.
If you're building security tooling or are in an industry with significant security compliance requirements (fintech, healthtech, legal tech), Claude Mythos's capabilities represent the new threat model you're building against. Attackers who develop or obtain access to Mythos-level models will be operating at a scale and sophistication previously only attributable to nation-state actors.
And if you're building AI products: the agentic, multi-step reasoning capabilities that make Mythos exceptional at security research are the same capabilities that will transform other industries. The gap between what's in Glasswing today and what's in your product's AI stack tomorrow is closing faster than most roadmaps account for.
The Glasswing Bet
Project Glasswing is, at its core, a bet: that using the most powerful AI capability defensively first, systematically, at scale, can create a window of safety wide enough to be meaningful.
It's a more sophisticated approach to AI safety than anything the industry has attempted at this level. Whether it works depends on factors — patch deployment rates, adversary capabilities, coordination across the coalition — that no one can fully control.
But the alternative — releasing Mythos without attempting Glasswing — would have been indefensible. Whatever the outcome, Anthropic has established a precedent for how responsible frontier capability deployment can work.
Talk to SynCube about building AI-powered products on the right foundation for what's coming next.
SynCube is an AI development company and software house specializing in AI development, SaaS MVPs, and scalable web applications for startups worldwide.


