Tags: leaderboard, open source, benchmarks, community

Why Merged Models Keep Topping the Open LLM Leaderboard

MergeKit Team · 5 min read

If you browse the Open LLM Leaderboard on any given day, you'll notice something striking: a disproportionate number of top-ranking models are merges rather than traditionally fine-tuned models. This isn't a coincidence. Merging has structural advantages on broad benchmark suites: it combines complementary strengths, approximates an ensemble at the cost of a single model, and allows very fast iteration.

Diversity Beats Specialization

Benchmarks like MMLU, ARC, and HellaSwag test a wide range of capabilities. A model fine-tuned heavily on code might ace HumanEval but struggle on TruthfulQA. Merging lets you combine a code specialist, a reasoning specialist, and an instruction-following specialist into one model, provided they share the same architecture (and, in practice, usually the same base model). The result tends to score well across the whole suite rather than excelling on one benchmark and lagging on the rest.
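To make the idea concrete, here is a minimal sketch of the simplest kind of merge, a weighted average of matching parameters. It uses plain Python lists in place of real tensors, and the names `linear_merge`, `code_model`, `reasoning_model`, and `chat_model` are ours for illustration only. Real merge methods are often more sophisticated than this, so treat it as a toy under those assumptions.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def linear_merge(models: List[Weights], coeffs: List[float]) -> Weights:
    """Weighted average of parameters that share names and shapes."""
    if len(models) != len(coeffs):
        raise ValueError("need exactly one coefficient per model")
    total = sum(coeffs)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")
    norm = [c / total for c in coeffs]  # normalize so the weights sum to 1

    merged: Weights = {}
    for name in models[0]:
        rows = [m[name] for m in models]  # KeyError if a model lacks this parameter
        if any(len(r) != len(rows[0]) for r in rows):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            sum(w * v for w, v in zip(norm, values)) for values in zip(*rows)
        ]
    return merged


# Three hypothetical specialists with identical parameter layouts.
code_model = {"layer.weight": [1.0, 0.0, 2.0]}
reasoning_model = {"layer.weight": [0.0, 4.0, 2.0]}
chat_model = {"layer.weight": [2.0, 2.0, 2.0]}

# Give the chat model twice the influence of the other two.
merged = linear_merge([code_model, reasoning_model, chat_model], [1.0, 1.0, 2.0])
print(merged)  # {'layer.weight': [1.25, 2.0, 2.0]}
```

The shape check is the part that mirrors the real constraint: averaging only makes sense when every source model has the same parameters in the same layout.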

Ensemble Effects Without Ensemble Cost

Model merging achieves something like an ensemble, combining diverse "opinions" from multiple models, but packs them into a single set of weights. You get much of the accuracy benefit of ensembling while paying the inference cost of only one model.
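A small worked example shows both why this comparison holds and where it stops holding. For a purely linear model, averaging the weights gives exactly the same output as averaging the predictions, with one forward pass instead of two. Once a nonlinearity is involved, as in any real LLM, the two can differ, so a merge is an approximation of an ensemble rather than an equivalent. The toy models and the `predict` helper below are illustrative assumptions, not part of any library.

```python
def predict(weights, x):
    """Linear toy model: dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))


def relu_predict(weights, x):
    """Same model with a ReLU on the output, to introduce a nonlinearity."""
    return max(0.0, predict(weights, x))


model_a = [0.5, 1.0]
model_b = [1.5, -1.0]
x = [2.0, 4.0]

# Ensemble: run both models, then average their outputs (two forward passes).
ensemble_out = (predict(model_a, x) + predict(model_b, x)) / 2

# Merge: average the weights once, then run a single model (one forward pass).
merged_w = [(a + b) / 2 for a, b in zip(model_a, model_b)]
merged_out = predict(merged_w, x)

print(ensemble_out, merged_out)  # 2.0 2.0 (identical for a linear model)

# With a nonlinearity the two no longer agree exactly.
relu_ensemble = (relu_predict(model_a, x) + relu_predict(model_b, x)) / 2
relu_merged = relu_predict(merged_w, x)

print(relu_ensemble, relu_merged)  # 2.5 2.0
```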

Rapid Iteration Cycles

Training a competitive model from scratch can take weeks and cost thousands of dollars or more. Merging needs no gradient updates, so it often takes minutes on consumer hardware. This means the community can iterate very quickly: try a merge, evaluate it, tweak the recipe, and try again. The sheer volume of experiments drives rapid improvement.
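The try, evaluate, tweak loop can be sketched as a small search over one recipe parameter. Everything here is a stand-in: `interpolate` blends two toy weight vectors, and `toy_score` is a made-up objective that plays the role of a benchmark run. In practice the evaluation step is the expensive part, but the shape of the loop is the same.

```python
def interpolate(a, b, t):
    """Blend two weight vectors: t=0 returns a, t=1 returns b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def toy_score(weights):
    """Hypothetical stand-in for a benchmark: higher is better, 0 is the best possible."""
    target = [1.0, 3.0]
    return -sum((w - g) ** 2 for w, g in zip(weights, target))


model_a = [0.0, 4.0]
model_b = [4.0, 0.0]

# Each candidate is one "recipe"; each iteration is merge -> evaluate.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
results = {t: toy_score(interpolate(model_a, model_b, t)) for t in candidates}

best_t = max(results, key=results.get)
print(best_t)  # 0.25
```

Because each merge is cheap, widening the grid or adding a second parameter costs little beyond the extra evaluations.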

Implications for the Ecosystem

The dominance of merged models raises interesting questions about how we evaluate and share models. Provenance tracking becomes critical — which models contributed to a merge? Reproducibility matters — can someone else recreate the result from the recipe? These are exactly the problems MergeKit is designed to solve with its recipe registry, merge maps, and specialized leaderboards.
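One way to make provenance and reproducibility checkable is to fingerprint the recipe itself. The sketch below hashes a canonical JSON form of a recipe, so two people holding the same recipe get the same identifier and any change to a source model, revision, or weight produces a different one. The recipe fields (`method`, `sources`, `model`, `revision`, `weight`) and the model names are hypothetical and chosen for illustration; this is not MergeKit's actual recipe schema.

```python
import copy
import hashlib
import json


def recipe_fingerprint(recipe: dict) -> str:
    """SHA-256 of a canonical JSON encoding (sorted keys, no extra whitespace)."""
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical recipe: which models went in, at which revision, with what weight.
recipe = {
    "method": "linear",
    "sources": [
        {"model": "example-org/code-model", "revision": "rev-a", "weight": 0.5},
        {"model": "example-org/chat-model", "revision": "rev-b", "weight": 0.5},
    ],
}

# Same content with dictionary keys written in a different order.
# Key order does not affect the fingerprint; the order of list items does.
reordered = {
    "sources": [
        {"weight": 0.5, "revision": "rev-a", "model": "example-org/code-model"},
        {"weight": 0.5, "model": "example-org/chat-model", "revision": "rev-b"},
    ],
    "method": "linear",
}

# A tweaked recipe: one source weight changed.
tweaked = copy.deepcopy(recipe)
tweaked["sources"][0]["weight"] = 0.6

print(recipe_fingerprint(recipe) == recipe_fingerprint(reordered))  # True
print(recipe_fingerprint(recipe) == recipe_fingerprint(tweaked))    # False
```

Pinning each source to a revision matters here: a model name alone can point at different weights over time, which would make the same recipe produce different merges.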

The era of merged models is just beginning. As the tools and infrastructure mature, we expect merging to become a standard part of every model builder's workflow. Stay tuned — or join the waitlist to be part of it.
