GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning Paper • 2505.22661 • Published 1 day ago • 1
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14 • 84
Grimoire is All You Need for Enhancing Large Language Models Paper • 2401.03385 • Published Jan 7, 2024 • 5
xFinder: Robust and Pinpoint Answer Extraction for Large Language Models Paper • 2405.11874 • Published May 20, 2024 • 7
HRDE: Retrieval-Augmented Large Language Models for Chinese Health Rumor Detection and Explainability Paper • 2407.00668 • Published Jun 30, 2024 • 3