Blog, Articles, and discussions

mmBERT: ModernBERT goes Multilingual

By September 9, 2025 • 100

Community Articles

Visual Document Retrieval Goes Multilingual

By January 10, 2025 guest • 76

Introducing smolagents: simple agents that write actions in code.

By December 31, 2024 • 1.13k

Finally, a Replacement for BERT: Introducing ModernBERT

By December 19, 2024 guest • 692

Bamba: Inference-Efficient Hybrid Mamba2 Model

By December 18, 2024 guest • 58

Welcome the Falcon 3 Family of Open Models!

By December 17, 2024 • 129

Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard

SmolVLM - small yet mighty Vision Language Model

By November 26, 2024 • 366

Letting Large Models Debate: The First Multilingual LLM Debate Competition

By November 20, 2024 guest • 32

Faster Text Generation with Self-Speculative Decoding

By November 20, 2024 • 62

Introduction to the Open Leaderboard for Japanese LLMs

By November 20, 2024 guest • 38

Judge Arena: Benchmarking LLMs as Evaluators

By November 19, 2024 guest • 58

Universal Assisted Generation: Faster Decoding with Any Assistant Model

By October 29, 2024 guest • 59

Scaling AI-based Data Processing with Hugging Face + Dask

By October 9, 2024 • 32

Faster Assisted Generation with Dynamic Speculation

By October 8, 2024 guest • 49

There is no such thing as a tokenizer-free lunch

RexBERT: Encoders for a brave new world of E-Commerce

Nemotron-Personas-Japan: Synthesized Data for Sovereign AI

Model Quality: Hugging Face Is All You Need

SyGra: The One-Stop Framework for Building Data for LLMs and SLMs

Qianfan-VL: A Milestone Achievement in Chinese Multimodal AI with Domestic Chips

PP-OCRv5 on Hugging Face: A Specialized Approach to OCR

mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL

Code a simple RAG from scratch

Uncensor any LLM with abliteration

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Ground-up efforts to build large datasets for effective and accurate translation of Modi-Script documents into modern Marathi

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Small Language Models (SLM): A Comprehensive Overview

Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

🌎 What kind of environmental impacts are AI companies disclosing? (And can we compare them?) 🌎

PrediBench: Testing AI models on prediction markets

Mastering Tensor Dimensions in Transformers

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

MamayLM, a state-of-the-art language model for Ukrainian