arxiv:2507.22448

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Published on Jul 30
· Submitted by JingweiZuo on Jul 31
#2 Paper of the day

Abstract

AI-generated summary: Falcon-H1, a new series of large language models with a hybrid architecture combining Transformer-based attention and State Space Models, achieves state-of-the-art performance and efficiency across various tasks and sizes.

In this report, we introduce Falcon-H1, a new series of large language models (LLMs) featuring hybrid architecture designs optimized for both high performance and efficiency across diverse use cases. Unlike earlier Falcon models built solely on Transformer or Mamba architectures, Falcon-H1 adopts a parallel hybrid approach that combines Transformer-based attention with State Space Models (SSMs), known for superior long-context memory and computational efficiency. We systematically revisited model design, data strategy, and training dynamics, challenging conventional practices in the field. Falcon-H1 is released in multiple configurations, including base and instruction-tuned variants at 0.5B, 1.5B, 1.5B-deep, 3B, 7B, and 34B parameters. Quantized instruction-tuned models are also available, totaling over 30 checkpoints on Hugging Face Hub. Falcon-H1 models demonstrate state-of-the-art performance and exceptional parameter and training efficiency. The flagship Falcon-H1-34B matches or outperforms models up to 70B scale, such as Qwen3-32B, Qwen2.5-72B, and Llama3.3-70B, while using fewer parameters and less data. Smaller models show similar trends: the Falcon-H1-1.5B-Deep rivals current leading 7B-10B models, and Falcon-H1-0.5B performs comparably to typical 7B models from 2024. These models excel across reasoning, mathematics, multilingual tasks, instruction following, and scientific knowledge. With support for up to 256K context tokens and 18 languages, Falcon-H1 is suitable for a wide range of applications. All models are released under a permissive open-source license, underscoring our commitment to accessible and impactful AI research.
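As a rough illustration of the parallel hybrid idea described above, the sketch below runs an attention branch and a toy linear state-space branch on the same normalized input and sums their outputs into the residual stream. This is only a conceptual example: `ParallelHybridBlock`, `ToySSM`, and all dimension choices are assumptions made for illustration, not the actual Falcon-H1 architecture or code.

```python
# Conceptual sketch of a parallel hybrid block (NOT the Falcon-H1 implementation):
# one normalization, then an attention branch and an SSM branch in parallel,
# with both outputs added back to the residual stream.
import torch
import torch.nn as nn


class ToySSM(nn.Module):
    """A deliberately simplified linear state-space layer (illustration only)."""

    def __init__(self, d_model: int, d_state: int = 64):
        super().__init__()
        self.decay_logit = nn.Parameter(torch.zeros(d_state))    # per-state decay
        self.in_proj = nn.Linear(d_model, d_state, bias=False)   # "B" projection
        self.out_proj = nn.Linear(d_state, d_model, bias=False)  # "C" projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); sequential scan for clarity, not speed.
        decay = torch.sigmoid(self.decay_logit)                  # each entry in (0, 1)
        h = x.new_zeros(x.shape[0], decay.shape[0])              # hidden state
        outputs = []
        for t in range(x.shape[1]):
            h = decay * h + self.in_proj(x[:, t])                # h_t = a * h_{t-1} + B x_t
            outputs.append(self.out_proj(h))                     # y_t = C h_t
        return torch.stack(outputs, dim=1)


class ParallelHybridBlock(nn.Module):
    """Attention and SSM branches see the same input; their outputs are summed."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ssm = ToySSM(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        seq_len = h.shape[1]
        causal = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=h.device), diagonal=1
        )
        attn_out, _ = self.attn(h, h, h, attn_mask=causal)
        return x + attn_out + self.ssm(h)                        # parallel combination


# Quick shape check with random data.
block = ParallelHybridBlock(d_model=256)
print(block(torch.randn(2, 32, 256)).shape)  # torch.Size([2, 32, 256])
```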

Community

Paper author · Paper submitter (edited)

This 80+ page report dives deep into the key design decisions behind Falcon-H1 - from architectural innovations and data strategies to training recipes that challenge conventional practices in LLM development 🔥


It's about time we got a SOTA hybrid

·

Thank you for your massive effort to create this stepping stone



Models citing this paper 18
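
The released checkpoints are hosted on the Hugging Face Hub. Below is a minimal sketch of how one of them might be loaded with the transformers library; the repo id is an assumption based on the naming used in the abstract, so check the Hub for the exact model names and any minimum transformers version required by the hybrid architecture.

```python
# Minimal sketch of loading a released checkpoint with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-0.5B-Instruct"  # assumed repo id; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "State space models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```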


Datasets citing this paper 0


Spaces citing this paper 7

Collections including this paper 4