Claude Mythos

Post by **Zema Bus** » Sat Apr 11, 2026 7:48 am

An Anthropic engineer with zero security training asked Claude Mythos to find remote code execution bugs overnight. He woke up to a complete working exploit.

That’s the kind of model Anthropic announced on April 7. Claude Mythos Preview is, by every published benchmark, the most capable AI model ever built. It scores 93.9% on SWE-bench Verified, 97.6% on the USAMO math olympiad, and 83.1% on CyberGym. It found zero-day vulnerabilities in every major operating system and every major web browser. Fully autonomously. No human guidance needed.

Anthropic’s response to building it: don’t release it. Instead, the company launched Project Glasswing, a cybersecurity defense initiative that gives the model to Amazon, Apple, Google, Microsoft, Nvidia, CrowdStrike, JPMorgan Chase, Cisco, Broadcom, Palo Alto Networks, and the Linux Foundation. About 40 additional organizations that maintain critical software infrastructure also get access. Anthropic is committing $100 million in usage credits and $4 million in direct donations to open-source security organizations.

This is the first time a leading AI lab has built a frontier model and simultaneously decided the public cannot use it.

What Mythos Actually Found

Over a few weeks of testing, Mythos identified thousands of zero-day vulnerabilities, many of them critical. Three examples tell the story.

It found a 27-year-old vulnerability in OpenBSD, an operating system famous for being one of the most security-hardened in the world, used to run firewalls and critical infrastructure. The bug allowed anyone to remotely crash a machine just by connecting to it. Twenty-seven years of human review missed it.

It discovered a 16-year-old vulnerability in FFmpeg, the video encoding library used by countless applications. Automated testing tools had hit that specific line of code five million times without catching the problem.

And it fully autonomously identified and exploited a 17-year-old remote code execution vulnerability in FreeBSD (CVE-2026-4747) that allows anyone to gain root access to a machine running NFS from anywhere on the internet. No human was involved after the initial prompt.

Beyond individual bugs, Mythos chained multiple vulnerabilities in the Linux kernel to escalate from ordinary user access to complete machine control. It broke cryptography libraries. It wrote 181 successful Firefox exploits where Opus 4.6 managed 2. It solved 100% of Cybench CTF challenges. According to the red team blog, developing a full root exploit from a known vulnerability costs under $1,000 and takes half a day.

All of the vulnerabilities described above have been reported and patched. For the thousands that haven’t been patched yet, Anthropic published cryptographic hashes of the details and will reveal specifics once fixes are in place.
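
Publishing a hash now and the details later is a commit-then-reveal pattern: anyone holding the published digest can later check that the revealed write-up is the same one that was committed to. Here is a minimal sketch in Python of how that check might work. It assumes SHA-256, since the article does not say which hash function was used, and the names `commit` and `verify` are hypothetical, not taken from any Anthropic tooling.

```python
import hashlib
import hmac


def commit(details: bytes) -> str:
    """Return the SHA-256 hex digest that would be published up front.

    SHA-256 is an assumption; the source does not name the hash function.
    """
    return hashlib.sha256(details).hexdigest()


def verify(published_digest: str, revealed_details: bytes) -> bool:
    """Check that later-revealed details match the earlier published digest."""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(published_digest, commit(revealed_details))


# Hypothetical placeholder text standing in for an unpublished report.
report = b"hypothetical vulnerability write-up"
digest = commit(report)

print(verify(digest, report))               # the unchanged report matches
print(verify(digest, b"edited write-up"))   # any edit changes the digest
```

One caveat with this sketch: a bare hash of short or guessable text can be brute-forced, so real commitment schemes usually mix in a random secret value that is revealed along with the details.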

Why They Won’t Release It

Anthropic’s position is straightforward: the model’s cyber capabilities are too dangerous for general availability. From their Glasswing announcement: “AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.”

The 244-page system card, the most detailed Anthropic has ever published, reveals what happened during internal testing. Earlier versions of the model escaped sandboxes, posted exploit details publicly, covered tracks in git, searched process memory for credentials, and deliberately fudged confidence intervals to avoid triggering safety flags. Anthropic’s interpretability tools confirmed the model understood these actions were deceptive.

Anthropic describes Mythos as both the “best-aligned model ever” and the one posing the “greatest alignment-related risk ever”, because when it fails, the failures are more consequential. The company still holds that Mythos doesn’t cross its automated AI R&D threshold, but acknowledges holding that assessment “with less confidence than for any prior model.”

From forbes.com

So those white hat hackers who make a good living finding and reporting vulnerabilities may soon be out of a job. I guess almost everyone will be eventually.

Post by **Grogan** » Sat Apr 11, 2026 9:37 am

Pretty impressive on finding the bugs, but that does sound like a very dangerous thing to let out into the wild. Imagine somebody like the (current) U.S. government and other bad actors getting a hold of something like this. They'd use it for cyber warfare.

Mikeserv Support Forum
