---
inference: false
language:
- en
- zh
tags:
- instruction-finetuning
task_categories:
- text-generation
base_model:
- meta-llama/Llama-3.1-8B-Instruct
license: cc-by-nc-nd-4.0
---

<h1 align="center">
xVerify-8B-I
</h1>

<p align="center">
<div style="display: flex; justify-content: center; gap: 10px;">
<a href="https://github.com/IAAR-Shanghai/xVerify">
<img src="https://img.shields.io/badge/GitHub-Repository-blue?logo=github" alt="GitHub"/>
</a>
<a href="https://huggingface.co/IAAR-Shanghai/xVerify-8B-I">
<img src="https://img.shields.io/badge/🤗%20Hugging%20Face-xVerify--8B--I-yellow" alt="Hugging Face"/>
</a>
</div>
</p>

xVerify is an evaluation tool fine-tuned from a pre-trained large language model and designed specifically for objective questions with a single correct answer. It extracts the final answer from lengthy reasoning processes and identifies equivalence between different forms of expression.

---

## ✨ Key Features

### Broad Applicability
Suitable for various objective question evaluation scenarios including math problems, multiple-choice questions, classification tasks, and short-answer questions.

### Handles Long Reasoning Chains
Effectively processes answers with extensive reasoning steps to extract the final answer, regardless of complexity.

### Multilingual Support
Primarily handles Chinese and English responses while remaining compatible with other languages.

### Powerful Equivalence Judgment
- Recognizes basic transformations like letter case changes and Greek letter conversions
- Identifies equivalent mathematical expressions across formats (LaTeX, fractions, scientific notation)
- Determines semantic equivalence in natural language answers
- Matches multiple-choice responses by content rather than just option identifiers
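
To make the idea of equivalence across formats concrete, the following is a minimal rule-based sketch in Python. It is not how xVerify works (xVerify is a fine-tuned language model); it only illustrates the kind of judgment described above for one narrow case, numeric answers written as decimals, fractions, scientific notation, or a simple LaTeX `\frac`. The helper names `parse_numeric` and `numerically_equivalent` are hypothetical and exist only for this illustration.

```python
import re
from fractions import Fraction


def parse_numeric(answer: str) -> Fraction:
    # Hypothetical helper: turn a numeric answer string into an exact Fraction.
    # Handles plain decimals ("0.5"), a/b fractions ("1/2"),
    # scientific notation ("5e-1"), and LaTeX \frac{a}{b} or \dfrac{a}{b}.
    text = answer.strip().strip("$").strip()
    # Rewrite LaTeX \frac{a}{b} as a/b so Fraction can read it.
    match = re.fullmatch(r"\\d?frac\{(-?\d+)\}\{(\d+)\}", text)
    if match:
        text = f"{match.group(1)}/{match.group(2)}"
    # Fraction raises ValueError for strings it cannot parse.
    return Fraction(text)


def numerically_equivalent(candidate: str, reference: str) -> bool:
    # Hypothetical helper: True only if both strings parse to the same value.
    try:
        return parse_numeric(candidate) == parse_numeric(reference)
    except (ValueError, ZeroDivisionError):
        # Anything outside this narrow numeric grammar is treated as a mismatch.
        return False


print(numerically_equivalent("5e-1", r"\frac{1}{2}"))
print(numerically_equivalent("one half", "1/2"))
```

A rule-based check like this fails on the second call because it cannot read natural language, which is the gap that a model-based verifier is meant to cover.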

---

## Citation

```bibtex
@article{xVerify,
  title={xVerify: Efficient Answer Verifier for Reasoning Model Evaluations},
  author={Ding Chen and Qingchen Yu and Pengyuan Wang and Wentao Zhang and Bo Tang and Feiyu Xiong and Xinchi Li and Minchuan Yang and Zhiyu Li},
  journal={arXiv preprint arXiv:2504.10481},
  year={2025},
}
```