Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published Mar 21 • 37
AgentRxiv: Towards Collaborative Autonomous Research Paper • 2503.18102 • Published Mar 23 • 24
WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation Paper • 2503.19065 • Published Mar 24 • 11
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 298
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 40
Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models Paper • 2408.02442 • Published Aug 5, 2024 • 21
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19, 2024 • 77
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers Paper • 2408.06195 • Published Aug 12, 2024 • 74
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 124
Article Our Transformers Code Agent beats the GAIA benchmark! By m-ric and 1 other • Jul 1, 2024 • 94
Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus Paper • 2406.08598 • Published Jun 12, 2024 • 6
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 12 days ago • 209