GIS Standard.

Project Glasswing: Why Anthropic Built Its Strongest AI and Then Locked It Away

Aiden Yu

Last week Anthropic did something pretty strange for an AI company. They finished their most powerful model ever, and then told everyone they weren't going to let most people use it.


The model is called Claude Mythos Preview. Anthropic says it’s so good at finding security holes in software that releasing it normally would basically hand hackers a master key to the internet. So instead of selling it like a regular product, Anthropic locked it up and started something called Project Glasswing.

So what is Project Glasswing?

It's a team-up. Anthropic picked twelve big partners (including Apple, Google, Microsoft, Amazon, Cisco, CrowdStrike, Nvidia, JPMorgan, Broadcom, Palo Alto Networks, and the Linux Foundation) plus around 40 smaller groups that take care of the software the world actually runs on. All of them get to use Mythos Preview, but only for one job: finding bugs in important code and fixing them before hackers find the same bugs first.


The name comes from the glasswing butterfly, which has see-through wings. Anthropic's team picked it because software bugs are kind of like that too. They're sitting right there, but nobody can see them until something goes wrong.


To make this happen Anthropic is putting up $100 million in free usage credits for the partners, plus $4 million in donations to open source security groups like the Linux Foundation and the Apache Software Foundation.


What did the model actually find?

Over the past few weeks Anthropic ran Mythos Preview on a bunch of widely used software, and it found thousands of zero-day bugs (a “zero-day” is just a security flaw nobody knew about yet), including flaws in every major operating system and every major web browser.


A security researcher named Nicholas Carlini, who works with Anthropic, said in one of their videos that he’s “found more bugs in the last couple of weeks than I found in the rest of my life combined.” That’s kind of a wild thing for a professional to admit.

Why hide it instead of sell it?

The same skill that helps Mythos find bugs so defenders can patch them also helps anyone else find bugs to attack with. The window between a bug being discovered and a bug being used in a real attack used to be months. With AI it’s closer to minutes.


Anthropic also tested earlier versions of the model in safe lab setups and noticed some weird behavior. The model tried to break out of its sandbox, looked for hidden passwords, and in one case edited its own change history to cover its tracks after doing something it wasn’t supposed to. Anthropic thinks this was the model trying to finish its task in the wrong way rather than some secret evil plan, but it’s still the kind of thing you don’t want loose on the open internet.


So they made a call: keep Mythos in a small room with trusted people, and use it to patch holes before everyone else’s AI catches up.

What this means

Project Glasswing is kind of an experiment in slowing down on purpose. The whole AI industry has been racing to ship things faster, and here’s a company saying “actually, let’s hand this one to the defenders first and give them a head start.” Whether it works depends on whether the patches go out before some other lab builds something just as strong with fewer rules attached.


For now, the most powerful AI Anthropic has ever made is sitting in a locked room, quietly fixing bugs nobody knew existed. It’s a weird moment, and probably an important one.