Monday, April 20, 2026

Meet OpenMythos: An Open-Source PyTorch Reconstruction of Claude Mythos Where 770M Parameters Match a 1.3B Transformer


Anthropic has never published a technical paper on Claude Mythos. That has not stopped the research community from theorizing. A new open-source project called OpenMythos, released on GitHub by Kye Gomez, attempts something ambitious: a first-principles theoretical reconstruction of what the Claude Mythos architecture might actually be, built entirely in PyTorch and grounded in peer-reviewed research.

The project is not a leaked model, a fine-tune, or a distillation. It is a hypothesis rendered in code — and the hypothesis is specific enough to be falsifiable, which is what makes it interesting.

The Main Claim: Claude Mythos Is a Recurrent-Depth Transformer

OpenMythos proposes that Claude Mythos belongs to a class of architectures called Recurrent-Depth Transformers (RDTs), also referred to in the literature as Looped Transformers. The concept is meaningfully different from standard transformer stacks.

In a conventional transformer — GPT, LLaMA, Mistral — the model passes input through a sequence of unique layers, one after another, each with its own independent weights. More capability generally means more layers and more parameters. In a Recurrent-Depth Transformer, a fixed set of weights is applied iteratively across T loop steps within a single forward pass. The same weights run multiple times. Reasoning depth is not a function of how many parameters are stored, but of how many iterations are run at inference time.

Think of it less like reading a book and more like refining a draft: the model returns to the same computational block again and again, improving its internal representation with each pass.

How the Architecture Is Structured

OpenMythos instantiates this as a three-part structure: Prelude → Recurrent Block → Coda. The Prelude and Coda are standard transformer layers that run exactly once. The Recurrent Block is the computational core, looped up to T=16 times.

At each loop step t, the hidden state is updated using the following rule:

h_{t+1} = A·h_t + B·e + Transformer(h_t, e)

Here, h_t is the hidden state after loop iteration t, and e is the encoded input from the Prelude, re-injected at every step. The re-injection is deliberate: without it, the hidden state would drift away from the original input signal across deep loops. The learned matrices A and B govern how much of the previous hidden state and of the encoded input carry forward at each step.
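The whole Prelude → Recurrent Block → Coda pass can be sketched in a few lines of PyTorch. Everything here is a toy stand-in (plain Linear layers for the three stages, a hand-picked contractive A, hidden size 8), chosen only to make the recurrence concrete; it is not the OpenMythos implementation.

```python
import torch

torch.manual_seed(0)
d = 8          # toy hidden size (assumption)
T = 4          # number of loop steps

# Hypothetical stand-ins for the Prelude, Recurrent Block, and Coda.
prelude = torch.nn.Linear(d, d)
block   = torch.nn.Linear(2 * d, d)   # plays the role of Transformer(h_t, e)
coda    = torch.nn.Linear(d, d)

A = 0.9 * torch.eye(d)  # contractive state matrix, so rho(A) < 1
B = torch.eye(d)

x = torch.randn(3, d)   # batch of 3 token embeddings
e = prelude(x)          # encoded input, re-injected at every step
h = torch.zeros_like(e)
for t in range(T):      # h_{t+1} = A h_t + B e + Transformer(h_t, e)
    h = h @ A.T + e @ B.T + block(torch.cat([h, e], dim=-1))
y = coda(h)
```

Note that only T changes if you want a deeper pass: the loop can run more iterations at inference time without touching any stored weights, which is exactly the property the article describes.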

The FFN inside the Recurrent Block is not a standard feedforward layer. OpenMythos replaces it with a Mixture-of-Experts (MoE) layer following the design introduced in DeepSeekMoE: a large pool of fine-grained routed experts, with only a sparse top-K subset activated per token, alongside a small set of always-active shared experts that absorb common cross-domain patterns. Crucially, the router selects distinct expert subsets at each loop depth, meaning each iteration is computationally distinct despite sharing the same base weights. MoE provides domain breadth; looping provides reasoning depth.
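A minimal sketch of that routing pattern, with toy sizes and a plain softmax top-K router as assumptions (the real DeepSeekMoE design adds load-balancing losses and other machinery this omits):

```python
import torch

torch.manual_seed(0)
d, n_routed, n_shared, top_k = 8, 6, 1, 2   # toy sizes (assumptions)

experts = torch.nn.ModuleList(torch.nn.Linear(d, d) for _ in range(n_routed))
shared  = torch.nn.ModuleList(torch.nn.Linear(d, d) for _ in range(n_shared))
router  = torch.nn.Linear(d, n_routed)

def moe_ffn(h):
    # Shared experts always run; routed experts are a sparse top-K mixture.
    out = sum(s(h) for s in shared)
    gates = router(h).softmax(dim=-1)            # (tokens, n_routed)
    w, idx = gates.topk(top_k, dim=-1)           # keep K experts per token
    w = w / w.sum(dim=-1, keepdim=True)          # renormalize kept gates
    for k in range(top_k):
        for j in range(n_routed):
            mask = idx[:, k] == j                # tokens routed to expert j
            if mask.any():
                out[mask] += w[mask, k, None] * experts[j](h[mask])
    return out

h = torch.randn(5, d)
y = moe_ffn(h)
```

Each token activates only top_k of the n_routed experts plus the shared pool, so compute per token stays roughly constant while total capacity grows with the expert count.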

Attention defaults to Multi-head Latent Attention (MLA) from DeepSeek-V2, which caches a compressed low-rank KV latent rather than full key/value tensors, yielding a 10–20× reduction in KV memory at production scale.
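The memory saving comes from caching only the low-rank latent. A toy single-head sketch, with all multi-head structure omitted and dimensions chosen only to make the compression ratio visible (none of these numbers come from the paper):

```python
import torch

torch.manual_seed(0)
d, d_latent, n = 16, 4, 10   # model dim, compressed KV latent dim, seq len

down  = torch.nn.Linear(d, d_latent, bias=False)  # compress hidden -> KV latent
up_k  = torch.nn.Linear(d_latent, d, bias=False)  # reconstruct keys
up_v  = torch.nn.Linear(d_latent, d, bias=False)  # reconstruct values
q_proj = torch.nn.Linear(d, d, bias=False)

x = torch.randn(n, d)
latent = down(x)             # only this (n, d_latent) tensor is cached
k, v = up_k(latent), up_v(latent)
q = q_proj(x)
attn = torch.softmax(q @ k.T / d ** 0.5, dim=-1) @ v

full_cache = 2 * n * d       # floats for standard K and V caches
mla_cache  = n * d_latent    # floats for the compressed latent
```

With these toy sizes the latent cache is 8× smaller than storing K and V outright; production configurations with many heads are where the 10–20× figure comes from.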

Reasoning in Continuous Latent Space

One of the most important properties of this architecture is that reasoning occurs entirely in continuous latent space. There is no intermediate token emission between loop steps — the model does not produce text mid-thought and then re-read it. This is structurally distinct from chain-of-thought prompting, where reasoning is externalized as token sequences, and has been formally analyzed in both Saunshi et al. (2025) and COCONUT (2024).

Saunshi et al. (2025) formally show that each loop iteration in an RDT is functionally equivalent to one step of chain-of-thought, but operating over real-valued vectors rather than discrete tokens. Continuous latent thoughts can also encode multiple alternative next steps simultaneously, enabling something closer to breadth-first search over the reasoning space within a single forward pass.

This also explains a concrete capability advantage. A standard transformer trained on 5-hop reasoning chains fails when tested on 10-hop chains at inference time — it has no mechanism to extend its depth beyond what it saw during training. A Recurrent-Depth Transformer handles this naturally: running more inference-time loops extends the reasoning chain without any retraining. Harder problems receive more compute; simpler ones exit early.

Solving the Stability Problem

Training looped models has historically been brittle. The hidden state ht can grow unboundedly across iterations — a failure mode called residual explosion. OpenMythos addresses this using a Linear Time-Invariant (LTI) injection constraint borrowed from the Parcae architecture (Prairie et al., 2026): the spectral radius of A, denoted ρ(A), is enforced to be less than 1 by construction, guaranteeing stability regardless of learning rate or gradient noise.
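One simple way to get ρ(A) < 1 by construction, shown here as an illustrative parameterization rather than necessarily Parcae's, is to rescale A by its spectral norm, which upper-bounds the spectral radius; the 0.99 margin below is an arbitrary choice:

```python
import torch

torch.manual_seed(0)
d = 6
A_raw = torch.randn(d, d)

# The spectral norm (largest singular value) upper-bounds the spectral
# radius, so rescaling by it guarantees rho(A) < 1 no matter what values
# the optimizer pushes into A_raw.
A = 0.99 * A_raw / torch.linalg.matrix_norm(A_raw, ord=2)
rho = torch.linalg.eigvals(A).abs().max().item()

# With rho(A) < 1 the linear part of the recurrence is a contraction:
# the state converges toward a fixed point instead of exploding.
h, b = torch.randn(d), torch.randn(d)
for _ in range(1000):
    h = A @ h + b
```

Because the bound holds by construction, stability does not depend on the learning rate or on gradient noise, which is the point of making it an architectural constraint rather than a regularizer.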

A second failure mode also exists at the other extreme: beyond a certain loop depth, excessive recurrence degrades predictions — the hidden state drifts past the solution and into noise. This is the ‘overthinking’ problem. Adaptive Computation Time (ACT) halting addresses it with a learned scalar per position that dynamically decides when to stop looping. Positions that are harder to process receive more computation; tokens that have already converged halt early.
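A minimal sketch of ACT-style halting, assuming a sigmoid halting head per position and an arbitrary 0.99 cumulative threshold (both assumptions, not values from OpenMythos):

```python
import torch

torch.manual_seed(0)
d, T_max, threshold = 8, 16, 0.99   # toy sizes; threshold is an assumption

block = torch.nn.Linear(d, d)
halt  = torch.nn.Linear(d, 1)       # learned per-position halting scalar

h = torch.randn(5, d)               # 5 positions
cum_p = torch.zeros(5)              # cumulative halting probability
steps = torch.zeros(5, dtype=torch.long)
for t in range(T_max):
    active = cum_p < threshold      # positions still looping
    if not active.any():
        break                       # every position has halted early
    # Only active positions get updated; halted ones keep their state.
    h = torch.where(active[:, None], torch.tanh(block(h)), h)
    cum_p = cum_p + active * torch.sigmoid(halt(h)).squeeze(-1)
    steps += active                 # count iterations spent per position
```

Each position accumulates halting probability until it crosses the threshold, so hard positions loop longer while converged ones stop, which is exactly the adaptive-compute behavior the article describes.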

Finally, Depth-Wise LoRA adapters introduce a small rank-r adaptation matrix at each iteration depth, giving each loop step slightly distinct behavior without adding substantial parameters — bridging the gap between pure weight-tying and fully distinct layers.
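Sketched with toy dimensions (hidden size 64, rank 2, four loop steps, all assumptions), the idea and its small parameter overhead are easy to see:

```python
import torch

torch.manual_seed(0)
d, r, T = 64, 2, 4    # hidden size, LoRA rank, loop depth (toy values)

W = torch.nn.Linear(d, d)   # tied base weights, shared across all loop steps
# One rank-r (down, up) adapter pair per loop depth - a hypothetical sketch.
down = [torch.nn.Linear(d, r, bias=False) for _ in range(T)]
up   = [torch.nn.Linear(r, d, bias=False) for _ in range(T)]

h = torch.randn(3, d)
for t in range(T):
    # Same base weights every step, plus a depth-specific low-rank delta.
    h = W(h) + up[t](down[t](h))

extra = sum(p.numel() for m in down + up for p in m.parameters())
base  = sum(p.numel() for p in W.parameters())
```

Each adapter pair costs only 2·d·r parameters per depth, so the per-iteration specialization stays a small fraction of the tied base weights it modulates.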

Why Parameter Efficiency Matters

The Parcae paper (Prairie et al., 2026) provides empirical grounding for the efficiency claim. At 770M parameters, an RDT matches a 1.3B standard transformer trained on identical data — roughly half the parameters for equivalent downstream quality. Optimal recurrence and optimal token count both follow power laws with consistent exponents across scales, establishing the first predictable scaling laws for looped training.

The implication is significant: reasoning depth scales with inference-time compute, not stored parameter count. This reframes one of the dominant assumptions in the scaling debate. The relevant axis may not be parameter count at training, but loop depth at inference.

What OpenMythos Contributes

OpenMythos provides four concrete research artifacts: a fully configurable PyTorch implementation of the RDT hypothesis with MoE FFN and Multi-head Latent Attention; LTI-stable recurrent injection integrated as a first-class training primitive; depth-wise LoRA adapters enabling per-iteration behavioral differentiation; and a reproducible research baseline for studying looped transformer dynamics and inference-time reasoning depth.

Whether or not Mythos is actually an RDT, OpenMythos gives the research community something concrete and runnable — an implementation of an architecture class the literature increasingly suggests is underexplored, and one that may represent a fundamentally different path to capable AI than simply training bigger models.


