A collection of models and datasets from the paper "The Hallucination Tax of Reinforcement Finetuning".
AI & ML interests
Natural Language Processing
Papers from LIME Lab
- Safer-Instruct: Aligning Language Models with Automated Preference Data (Paper • 2311.08685)
- CLIMB: A Benchmark of Clinical Bias in Large Language Models (Paper • 2407.05250)
- On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective (Paper • 2502.14296)
- WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback (Paper • 2408.15549)
We perform difficulty estimation on popular math datasets.