AgentRewardBench
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories Paper • 2504.08942 • Published 13 days ago • 27
McGill-NLP/agent-reward-bench Viewer • Updated 4 days ago • 1.41k • 2.5k • 2
Running • 4 • Agent Reward Bench Demo 💻 Visualize agent interactions with WebArena tasks
Running • Agent Reward Bench Leaderboard 🥇 Leaderboard for AgentRewardBench
BM25S
https://github.com/xhluca/bm25s
BM25S: Orders of magnitude faster lexical search via eager sparse scoring Paper • 2407.03618 • Published Jul 4, 2024 • 13
xhluca/bm25s-nq-index Updated Jul 10, 2024 • 8
xhluca/bm25s-arguana-index Updated Jul 13, 2024 • 3
xhluca/bm25s-climate-fever-index Updated Jun 18, 2024 • 1