VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications • Paper 2509.26490 • Published Sep 30
interstellarninja/hermes_reasoning_tool_use