Article 🦸🏻#1: Open-endedness and AI Agents – A Path from Generative to Creative AI? By Kseniase • Dec 25, 2024 • 14
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling Paper • 2504.13169 • Published Apr 17 • 39
Article The N Implementation Details of RLHF with PPO By vwxyzjn and 2 others • Oct 24, 2023 • 60
Long Reasoning Collection Datasets with reasoning traces for math and code (Train + Eval) • 49 items • Updated Mar 21 • 1
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 61
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 154
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9, 2024 • 46
Direct Preference Optimization Datasets Collection Datasets suitable for DPO based on having 'chosen', 'rejected', and 'prompt' columns. Created using librarian-bots/dataset-column-search-api • 5520 items • Updated Apr 6 • 6