"The Child That Surpassed Both Parents Through MRI-Guided Evolutionary Merge" • 1 day ago • 10
Introducing WM Bench: A Benchmark for Cognitive Intelligence in World Models • 3 days ago • 13
Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do • 22 days ago • 38
MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning • 24 days ago • 15
Structural Problems in AI Benchmarking and the Case for a Unified Evaluation Framework • 25 days ago • 12
Do Bubbles Form When Tens of Thousands of AIs Simulate Capitalism? • Feb 24 • 17
FINAL Bench Collection World's First Functional Metacognition Benchmark. "Not how much AI knows, but whether it knows what it doesn't know, and can fix it." • 2 items • Updated Feb 21 • 4