BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 39
OpenCoder Collection OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 4 items • Updated 1 day ago • 20
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published 2 days ago • 74
How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published 5 days ago • 27
🫐 ProX Projects Collection Collection for: "Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale" • 18 items • Updated 17 days ago • 2
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates Paper • 2410.07137 • Published about 1 month ago • 6
ProX Dataset Collection a collection of pre-training corpora refined by ProX • 5 items • Updated 22 days ago • 5
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25 • 59
MagpieLM Collection Aligning LMs with Fully Open Recipe (data+training configs+logs) • 9 items • Updated Sep 22 • 15
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild Paper • 2409.03753 • Published Sep 5 • 18
OLMoE Collection Artifacts for open mixture-of-experts language models. • 13 items • Updated Sep 25 • 25
LongRecipe: Recipe for Efficient Long Context Generalization in Large Languge Models Paper • 2409.00509 • Published Aug 31 • 38
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 86
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19 • 51
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 115
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm Paper • 2408.08072 • Published Aug 15 • 31