Anthropic Mythos replication: Vidoc mimics findings with public models


Vidoc Security replicated Anthropic’s Mythos vulnerability findings by re-running the same cases Anthropic highlighted and reproducing multiple bugs using publicly available AI models. The tests used GPT-5.4 and Claude Opus 4.6 within an open-source coding agent and targeted a server file‑sharing protocol, a networking stack in a security-focused operating system, embedded video‑processing software, and two cryptographic libraries. Each automated scan ran at a cost under $30 per file.


Anthropic recently launched Claude Mythos, emphasizing that the vulnerabilities it can expose make public deployment potentially dangerous. Vidoc Security set out to replicate these findings with accessible AI models, GPT-5.4 and Claude Opus 4.6, run through an open-source coding agent called opencode.

The targets Vidoc scanned were those highlighted by Anthropic: a server file-sharing protocol, the networking stack of a security-focused operating system, video-processing software in media platforms, and two cryptographic libraries essential for online digital identity verification. The replication suggests that public AI models can identify significant security flaws in widely used software components.

Vidoc ran GPT-5.4 and Claude Opus 4.6 inside opencode, an open-source coding agent, and re-ran the same cases Anthropic highlighted. The scans were executed without a Glasswing invite, without private API access, and without Anthropic’s internal stack: the publicly available models explored the supplied codebases for potential vulnerabilities in an open-tooling environment.

The workflow mirrored Anthropic’s public description: provide a codebase, let it explore, parallelize attempts, and filter for signals. Vidoc built the same architecture using open tooling and implemented a planning agent alongside a detection agent. This configuration allowed parallelized exploration and automated signal filtering across target codebases.
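The workflow described above (plan targets, run parallel attempts, filter for signals) can be illustrated with a minimal Python sketch. This is not Vidoc's or Anthropic's implementation, and it does not use opencode or any model API. Every name here (`Finding`, `plan_targets`, `scan`, `toy_detector`) is hypothetical, the planning step is reduced to a keyword match, and the detector is a stub standing in for a model call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Finding:
    path: str          # file the finding points at
    summary: str       # short description of the suspected bug
    confidence: float  # detector's self-reported score in [0, 1]


# A detector takes (path, source, attempt index) and returns candidate findings.
# In a real pipeline this would wrap a model call; here it is an assumption.
Detector = Callable[[str, str, int], List[Finding]]


def plan_targets(codebase: Dict[str, str], keywords: List[str]) -> List[str]:
    """Planning step (hypothetical): pick files by a naive keyword match."""
    return sorted(p for p, src in codebase.items() if any(k in src for k in keywords))


def scan(
    codebase: Dict[str, str],
    detector: Detector,
    keywords: List[str],
    attempts: int = 3,
    threshold: float = 0.5,
    workers: int = 4,
) -> List[Finding]:
    """Run several detection attempts per target in parallel, then filter."""
    targets = plan_targets(codebase, keywords)
    jobs = [(path, codebase[path], i) for path in targets for i in range(attempts)]

    # Parallelize attempts: each job is one independent detector run.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(lambda job: detector(*job), jobs))

    # Filter for signals: drop low-confidence findings and keep the
    # highest-scoring copy of each (path, summary) pair.
    best: Dict[tuple, Finding] = {}
    for batch in batches:
        for finding in batch:
            if finding.confidence < threshold:
                continue
            key = (finding.path, finding.summary)
            if key not in best or finding.confidence > best[key].confidence:
                best[key] = finding
    return sorted(best.values(), key=lambda f: (-f.confidence, f.path))


def toy_detector(path: str, source: str, attempt: int) -> List[Finding]:
    """Stub for a model call: flags memcpy with a score that varies by attempt."""
    if "memcpy" not in source:
        return []
    return [Finding(path, "unchecked memcpy length", 0.4 + 0.2 * attempt)]


if __name__ == "__main__":
    demo_codebase = {
        "net/input.c": "memcpy(dst, src, len);",
        "docs/README": "build instructions",
    }
    for finding in scan(demo_codebase, toy_detector, keywords=["memcpy"]):
        print(finding)
```

Running several attempts per file reflects the point that individual runs can disagree: in this sketch the lowest-scoring attempt falls below the threshold and is discarded, while the deduplication step collapses repeated hits into one reported lead.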

Both GPT-5.4 and Claude Opus 4.6 reproduced two of the bug cases in all three runs. Claude Opus 4.6 rediscovered a bug in OpenBSD three times, while GPT-5.4 did not find that bug at all. Some findings were partial: they surfaced the correct code area without identifying the exact root cause. Every scan cost under $30 per file.


“We replicated Mythos findings in opencode using public models, not Anthropic’s private stack,” Dawid Moczadło wrote on X. “A better way to read Anthropic’s Mythos release is not ‘one lab has a magical model.’ It is: the economics of vulnerability discovery are changing.”

“AI models are already good enough to narrow the search space, surface real leads, and sometimes recover the full root cause in battle-tested code,” Moczadło said on X.

Vidoc Security’s tests show that an independent team can reproduce Anthropic’s published Mythos vulnerability findings with publicly available AI tools. Moczadło frames the result less as proof of a uniquely capable model and more as a sign that the economics of vulnerability discovery are changing.
