AI & ML interests
A new generation of foundation models from first principles.
mlabonne
authored 2 papers 3 months ago
Post
10327
New family of 1B models just dropped!
> LiquidAI/LFM2.5-1.2B-Base: 10T → 28T tokens
> LiquidAI/LFM2.5-1.2B-Instruct: new large-scale multi-stage RL
> LiquidAI/LFM2.5-1.2B-JP: our most polite model
> LiquidAI/LFM2.5-VL-1.6B: multi-image, multilingual
> LiquidAI/LFM2.5-Audio-1.5B: 8x faster, no quality loss
Super proud of this release 🤗 (loading sketch below)
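For anyone who wants to try these right away, here is a minimal loading sketch for the instruct checkpoint via transformers. Only the repo id comes from the list above; the prompt, dtype, and generation settings are illustrative, and it assumes a transformers version with LFM2 support.

```python
# A minimal sketch, assuming a transformers version with LFM2 support.
# The repo id is taken from the post above; everything else is
# illustrative, not an official default.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Standard chat-template flow: build the prompt, generate, then decode
# only the newly generated tokens.
messages = [{"role": "user", "content": "Summarize MoE models in one line."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```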
ykhrustalev
authored a paper 4 months ago
adityatadimeti
authored a paper 5 months ago
fernandofernandes
authored 3 papers 5 months ago
Spectrum: Targeted Training on Signal to Noise Ratio
Paper • 2406.06623 • Published • 16
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
Paper • 2406.14971 • Published
Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit
Paper • 2506.06607 • Published • 3
zetianli
authored a paper 5 months ago
fernandofernandes
authored a paper 5 months ago
kohsei
authored a paper 5 months ago
sam-paech
authored 3 papers 6 months ago
EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models
Paper • 2312.06281 • Published • 2
Democratizing Diplomacy: A Harness for Evaluating Any Large Language Model on Full-Press Diplomacy
Paper • 2508.07485 • Published • 10
Antislop: A Comprehensive Framework for Identifying and Eliminating Repetitive Patterns in Language Models
Paper • 2510.15061 • Published • 3
GAD-cell
authored a paper 7 months ago
Post
8432
LiquidAI/LFM2-8B-A1B just dropped!
8.3B params with only 1.5B active/token
> Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM; see the sketch below)
> Pre-trained on 12T tokens → strong math/code/IF
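Since the post names llama.cpp and vLLM as target runtimes, here is a hedged offline-inference sketch with vLLM's Python API. The repo id comes from the post; the prompt and sampling parameters are illustrative, and it assumes a vLLM build that supports this MoE architecture.

```python
# A hedged sketch of offline inference with vLLM, one of the runtimes
# named in the post. Assumes a vLLM build that supports this MoE
# architecture; prompt and sampling settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="LiquidAI/LFM2-8B-A1B")
params = SamplingParams(temperature=0.3, max_tokens=256)

outputs = llm.generate(
    ["Write a Python function that returns the n-th Fibonacci number."],
    params,
)
print(outputs[0].outputs[0].text)
```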
s-jse
authored 2 papers 7 months ago
Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models
Paper • 2509.23233 • Published • 4
CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition
Paper • 2509.19768 • Published • 7
Post
3885
New drop of tiny task-specific models!
Want to do data extraction, translation, RAG, tool use, or math on a Raspberry Pi? We got you covered!
These tiny models were fine-tuned to perform narrow tasks extremely well, making them competitive with much larger models.
You can deploy them today on-device or even on GPUs for big data operations!
LiquidAI/liquid-nanos-68b98d898414dd94d4d5f99a
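To make the on-device claim concrete, here is a sketch of calling one of the nanos through the transformers pipeline API. The checkpoint id below is an assumption (the post only links the collection); swap in any model from LiquidAI/liquid-nanos-68b98d898414dd94d4d5f99a.

```python
# A sketch for running one of the task-specific nanos with the
# transformers pipeline API. The checkpoint id is an assumption --
# substitute any model from the liquid-nanos collection linked above.
from transformers import pipeline

extractor = pipeline(
    "text-generation",
    model="LiquidAI/LFM2-1.2B-Extract",  # assumed id; pick one from the collection
    device_map="auto",
)

prompt = "Extract the invoice number and total from: Invoice #8841, total due $312.50."
result = extractor(prompt, max_new_tokens=64, return_full_text=False)
print(result[0]["generated_text"])
```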