Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark (arXiv:2304.03279, published Apr 6, 2023)
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training (arXiv:2406.10670, published Jun 15, 2024)
DataComp-LM: In Search of the Next Generation of Training Sets for Language Models (arXiv:2406.11794, published Jun 17, 2024)
Eliminating Position Bias of Language Models: A Mechanistic Approach (arXiv:2407.01100, published Jul 1, 2024)
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models (arXiv:2412.02674, published Dec 3, 2024)
Quantifying Generalization Complexity for Large Language Models (arXiv:2410.01769, published Oct 2, 2024)
QTSumm: A New Benchmark for Query-Focused Table Summarization (arXiv:2305.14303, published May 23, 2023)
Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model (arXiv:2209.11477, published Sep 23, 2022)
ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples (arXiv:2210.12374, published Oct 22, 2022)
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers (arXiv:2408.06195, published Aug 12, 2024)