Inseq: An Interpretability Toolkit for Sequence Generation Models Paper • 2302.13942 • Published Feb 27, 2023 • 1
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling Paper • 2304.01373 • Published Apr 3, 2023 • 9
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model Paper • 2310.12611 • Published Oct 19, 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 31
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Paper • 2411.06469 • Published Nov 10, 2024 • 17
Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation Paper • 2409.20385 • Published Sep 30, 2024
WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation Paper • 2410.12722 • Published Oct 16, 2024 • 5
Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks Paper • 2406.12066 • Published Jun 17, 2024 • 8
Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias Paper • 2405.05506 • Published May 9, 2024 • 1
Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly Paper • 2310.12300 • Published Oct 18, 2023 • 1
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? Paper • 2406.04391 • Published Jun 6, 2024 • 9
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets Paper • 2103.12028 • Published Mar 22, 2021 • 3
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting Paper • 2212.09535 • Published Dec 19, 2022 • 1