Beyond Correlation: Interpretable Evaluation of Machine Translation Metrics Paper • 2410.05183 • Published Oct 7, 2024 • 1
Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering Paper • 2503.14996 • Published Mar 19 • 3
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • 25 days ago • 623
Stress-Testing MGT Detectors via Stylistic Alignment Collection Dataset and Models for the ACL 2025 paper "Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors" • 10 items • Updated 30 days ago • 1
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization Paper • 2506.10920 • Published Jun 12 • 6
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors Paper • 2505.24523 • Published May 30 • 9
Evaluating Lexical Proficiency in Neural Language Models Collection Public collection for our paper: "Evaluating Lexical Proficiency in Neural Language Models", C. Ciaccio, A. Miaschi, F. Dell'Orletta (ACL 2025) • 5 items • Updated May 26 • 2
Steering Large Language Models for Machine Translation Personalization Paper • 2505.16612 • Published May 22 • 6
ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models Paper • 2505.13180 • Published May 19 • 13
EuroBERT Collection Scaling Multilingual Encoders for European Languages • 4 items • Updated Mar 10 • 13
Gemma Neogenesis 💎🌍🇮🇹 Collection Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy • 12 items • Updated Mar 10 • 5
🇮🇹👓 LLaVA-NDiNO Collection HF Collection for the models of the paper "LLaVA-NDiNO: Empowering LLMs with Multimodality for the Italian Language" • 7 items • Updated Oct 20, 2024 • 3
Pythia Scaling Suite Collection Pythia is the first LLM suite designed specifically to enable scientific research on LLMs. To learn more see https://github.com/EleutherAI/pythia • 18 items • Updated Feb 26 • 29
ITA-Bench: Italian Benchmarks for LLMs Collection A collection of Italian benchmarks for Large Language Models. See also our Github repo: https://github.com/SapienzaNLP/ita-bench • 19 items • Updated Dec 4, 2024 • 6
Article Selective fine-tuning of Language Models with Spectrum By anakin87 • Sep 3, 2024 • 36
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9, 2024 • 40
ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget Paper • 2408.00103 • Published Jul 31, 2024 • 24
🧩 Verbalized Rebus @ CLiC-it 2024 Collection Materials for the paper "Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses" • 14 items • Updated Mar 1 • 3