Article Getting More from Your Test-Time Compute Budget with Portfolio Beam Search danelbaz • Feb 24 • 8
Article DeepMath: A lightweight math reasoning Agent with smolagents danf, mber, moshew • Dec 4, 2025 • 40
Article Introducing HELMET: Holistically Evaluating Long-context Language Models hyen, gaotianyu1350, houminmin, kding1, danf, moshew, cdq10131 • Apr 16, 2025 • 42
Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques jmamou • Mar 24, 2025 • 20
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Paper • 2502.09390 • Published Feb 13, 2025 • 16
Article Assisted Generation: a new direction toward low-latency text generation joaogante • May 11, 2023 • 78
Article Blazing Fast SetFit Inference with 🤗 Optimum Intel on Xeon danielkorat, tomaarsen, orenpereg, moshew, echarlaix, aprabh2 • Apr 3, 2024 • 11
Article Faster Assisted Generation with Dynamic Speculation jmamou, orenpereg, joaogante, lewtun, danielkorat, Nadav-Timor, moshew • Oct 8, 2024 • 51
Article SetFit: Efficient Few-Shot Learning Without Prompts Unso, lewtun, luketheduke, danielkorat, orenpereg, moshew • Sep 26, 2022 • 40
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Paper • 2408.02545 • Published Aug 5, 2024 • 40
Article Our Transformers Code Agent beats the GAIA benchmark 🏅 m-ric, sergeipetrov • Jul 1, 2024 • 100
Article Training and Finetuning Embedding Models with Sentence Transformers tomaarsen • May 28, 2024 • 274
Accelerating Speculative Decoding using Dynamic Speculation Length Paper • 2405.04304 • Published May 7, 2024 • 2
Distributed Speculative Inference of Large Language Models Paper • 2405.14105 • Published May 23, 2024 • 18
Article Building Cost-Efficient Enterprise RAG applications with Intel Gaudi 2 and Intel Xeon juliensimon, Haihao, antonyvance, MatrixYao, lianglv, gserochi, Debbh, kding1 • May 9, 2024 • 12
Article Introducing the Open Leaderboard for Hebrew LLMs! Shaltiel, TalGeva, OmerKo, clefourrier • May 5, 2024 • 56
Improving Classification Performance With Human Feedback: Label a few, we label the rest Paper • 2401.09555 • Published Jan 17, 2024 • 6
H_2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models Paper • 2306.14048 • Published Jun 24, 2023 • 14