Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence Paper • 2305.14334 • Published May 23, 2023 • 1
See, Say, and Segment: Teaching LMMs to Overcome False Premises Paper • 2312.08366 • Published Dec 13, 2023
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Paper • 2410.12851 • Published Oct 10, 2024 • 1
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights Paper • 2410.09008 • Published Oct 11, 2024 • 16
Text2SQL is Not Enough: Unifying AI and Databases with TAG Paper • 2408.14717 • Published Aug 27, 2024 • 24
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore Paper • 2407.12854 • Published Jul 9, 2024 • 29
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline Paper • 2406.11939 • Published Jun 17, 2024 • 6
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models Paper • 2406.04271 • Published Jun 6, 2024 • 28
Language models scale reliably with over-training and on downstream tasks Paper • 2403.08540 • Published Mar 13, 2024 • 14