Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published 13 days ago • 51
Running • 543 • Vision Arena (Testing VLMs side-by-side) • Analyze images to detect and label objects
Cosmos Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated Jan 17 • 39
Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use Paper • 2410.24218 • Published Oct 31, 2024 • 6
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 137
The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends Paper • 2409.14195 • Published Sep 21, 2024 • 13
GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection Paper • 2111.14592 • Published Nov 29, 2021 • 1
A Survey on Dialog Management: Recent Advances and Challenges Paper • 2005.02233 • Published May 5, 2020
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents Paper • 2305.13040 • Published May 22, 2023 • 2