Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena Paper • 2310.05746 • Published Oct 9, 2023 • 1
CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization Paper • 2310.10134 • Published Oct 16, 2023 • 1
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions Paper • 2010.03205 • Published Oct 7, 2020
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics Paper • 2102.01672 • Published Feb 2, 2021
Unsupervised Enrichment of Persona-grounded Dialog with Background Stories Paper • 2106.08364 • Published Jun 15, 2021
Large Language Models as Zero-Shot Conversational Recommenders Paper • 2308.10053 • Published Aug 19, 2023
Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos Paper • 2403.05535 • Published Mar 8, 2024 • 1
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models Paper • 2407.01725 • Published Jul 1, 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 70
WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries Paper • 2407.17468 • Published Jul 24, 2024
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25, 2024 • 114
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published about 1 month ago • 73