Anthropic Mythos - from the inside

An excerpt from a blog post by one of the maintainers of curl. A person with a lot of "real world experience" on one of the most widely used software tools in the planet.

Everything in the bulleted list below has been just copied-and-pasted (bold is mine, it was not in the original post). I have not added any comment, it is not necessary. This analysis goes in exactly the same direction that I hypothesized in my previous posts (here and here).

Before this first Mythos report, we had already scanned curl with several different very capable AI powered tools (I mean in addition to running a number of “normal” static code analyzers all the time, using the pickiest compiler options and doing fuzzing on it for years etc)... These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more.

The report concluded it found five “Confirmed security vulnerabilities”. I think using the term confirmed is a little amusing when the AI says it confidently by itself. Yes, the AI thinks they are confirmed, but the curl security team has a slightly different take.

Five issues felt like nothing as we had expected an extensive list. Once my curl security team fellows and I had poked on the this short list for a number of hours and dug into the details, we had trimmed the list down and were left with one confirmed vulnerability. The other four were three false positives (they highlighted shortcomings that are documented in API documentation) and the fourth we deemed “just a bug”.

The single confirmed vulnerability is going to end up a severity low.

The Mythos report on curl also contained a number of spotted bugs that it concluded were not vulnerabilities, much like any new code analyzer does when you run it on hundreds of thousands of lines of code. All the bugs in the report are being investigated and one by one we are fixing those that we agree with.

All in all about twenty bugs that are described and explained very nicely.

My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.

AI powered code analyzers are significantly better at finding security flaws and mistakes in source code than any traditional code analyzers did in the past. All modern AI models are good at this now. Anyone with time and some experimental spirits can find security problems now. The high quality chaos is real.

Any project that has not scanned their source code with AI powered tooling will likely find huge number of flaws, bugs and possible vulnerabilities with this new generation of tools. Mythos will, and so will many of the others.

It should be noted that the AI tools find the usual and established kind of errors we already know about. It just finds new instances of them.

We have not seen any AI so far report a vulnerability that would somehow be of a novel kind or something totally new. They do not reinvent the field in that way, but they do dig up more issues than any other tools did before.

Full post: Mythos finds a curl vulnerability

Alberto Bartoli - Blog

Cerca nel blog

Anthropic Mythos - from the inside

Commenti

Popular Posts

"Ingegneria deve essere difficile"

Perché studiare Analisi Matematica???

Cose che racconto nei corsi (e che poi si verificano) - UPDATED

On the Anthropic Mythos Preview - "too dangerous to release"

One must write correctly. One must explain oneself clearly.