datasets:
  - THU-KEG/ReaRAG-20k
language:
  - en
base_model:
  - THUDM/glm-4-9b
pipeline_tag: question-answering
tags:
  - rag
  - reasoning

ReaRAG-9B

🤗 Dataset • 💻 GitHub • 📃 Paper

ReaRAG-9B is trained from glm-4-9b and has an enhanced capability to generate knowledge-guided reasoning chains for iterative retrieval-augmented generation (RAG). The model supports a context window of up to 8k tokens.

Please refer to the Inference section in the GitHub repository for usage details.
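
For orientation only, the sketch below shows what a generic iterative RAG loop can look like: a reasoning step either issues a search query or returns a final answer, and each retrieved observation is appended to the context while it still fits in the window. This is not the ReaRAG inference code. `iterative_rag`, `step_fn`, `retrieve_fn`, and `count_tokens` are hypothetical names, the whitespace-based token count is a crude stand-in for the real tokenizer, and reading "8k" as 8192 tokens is an assumption.

```python
from typing import Callable, Tuple

MAX_CONTEXT_TOKENS = 8192  # assumption: "8k" interpreted as 8192 tokens


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def iterative_rag(
    question: str,
    step_fn: Callable[[str], Tuple[str, str]],
    retrieve_fn: Callable[[str], str],
    max_steps: int = 5,
) -> str:
    """Alternate reasoning steps and retrievals until an answer is produced.

    step_fn(context) is a hypothetical wrapper around the model. It returns
    (action, argument), where action is either "search" or "finish".
    """
    context = f"Question: {question}"
    for _ in range(max_steps):
        action, argument = step_fn(context)
        if action == "finish":
            return argument  # the argument is the final answer
        observation = retrieve_fn(argument)  # the argument is a search query
        candidate = f"{context}\nSearch: {argument}\nObservation: {observation}"
        if count_tokens(candidate) > MAX_CONTEXT_TOKENS:
            break  # stop rather than exceed the context budget
        context = candidate
    return "unknown"  # no answer within the step or context budget


# Toy stand-ins for the model and the retriever, for illustration only.
def toy_step(context: str) -> Tuple[str, str]:
    if "Observation:" in context:
        return ("finish", "Paris")
    return ("search", "capital of France")


def toy_retrieve(query: str) -> str:
    return "Paris is the capital of France."


answer = iterative_rag("What is the capital of France?", toy_step, toy_retrieve)
print(answer)
```

In practice `step_fn` would call the model and parse its output, and `retrieve_fn` would query a retrieval backend. The GitHub repository remains the authoritative source for the actual prompt format and inference pipeline.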

📚 Citation

If you use this model in your research or projects, please consider citing our work:

@article{lee2025rearag,
  title={ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation},
  author={Lee, Zhicheng and Cao, Shulin and Liu, Jinxin and Zhang, Jiajie and Liu, Weichuan and Che, Xiaoyin and Hou, Lei and Li, Juanzi},
  journal={arXiv preprint arXiv:2503.21729},
  year={2025}
}