R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training Paper • 2505.00358 • Published May 1 • 26
Time To Impeach LLM-as-a-Judge: Programs are the Future of Evaluation Paper • 2506.10403 • Published Jun 12 • 1
The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators Paper • 2407.11004 • Published Jun 25, 2024
ScriptoriumWS: A Code Generation Assistant for Weak Supervision Paper • 2502.12366 • Published Feb 17
Evaluating Sample Utility for Data Selection by Mimicking Model Weights Paper • 2501.06708 • Published Jan 12 • 5
Multimodal Data Curation via Object Detection and Filter Ensembles Paper • 2401.12225 • Published Jan 5, 2024
LETS Forecast: Learning Embedology for Time Series Forecasting Paper • 2506.06454 • Published Jun 6 • 5
BioRAG: A RAG-LLM Framework for Biological Question Reasoning Paper • 2408.01107 • Published Aug 2, 2024
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition Paper • 2410.05603 • Published Oct 8, 2024 • 11