AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions Paper • 2410.20424 • Published Oct 27 • 39
Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines Paper • 2412.14684 • Published 7 days ago • 1
PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking Paper • 2410.12375 • Published Oct 16 • 3
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant Paper • 2410.18603 • Published Oct 24 • 32
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3 • 49
MALT: Improving Reasoning with Multi-Agent LLM Training Paper • 2412.01928 • Published 23 days ago • 39
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 117
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5 • 63
view article Article 🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦⬛ By anakin87 • Oct 21 • 18
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25 • 104
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 28 days ago • 289
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 224
MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications Paper • 2409.07314 • Published Sep 11 • 50
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains Paper • 2406.12045 • Published Jun 17 • 6
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6 • 51