Update README.md
README.md (CHANGED)

---

# 🤖 Model Card: InfiX-ai/InfiAlign-Qwen-7B-DPO

<p align="center">
  <a href="https://arxiv.org/abs/2508.05496"><img src="https://img.shields.io/badge/arXiv-Paper-b31b1b?style=flat&logo=arxiv&logoColor=white" alt="arXiv Paper"></a>
  <a href="https://huggingface.co/papers/2508.05496"><img src="https://img.shields.io/badge/🤗%20HuggingFace-Daily%20Papers-ff9800?style=flat" alt="Hugging Face Paper"></a>
  <a href="https://huggingface.co/InfiX-ai/InfiAlign-Qwen-7B-SFT"><img src="https://img.shields.io/badge/🤗%20HuggingFace-SFT%20Model-ff9800?style=flat" alt="Hugging Face SFT Model"></a>
  <a href="https://huggingface.co/InfiX-ai/InfiAlign-Qwen-7B-DPO"><img src="https://img.shields.io/badge/🤗%20HuggingFace-DPO%20Model-ff9800?style=flat" alt="Hugging Face DPO Model"></a>
  <a href="https://github.com/InfiXAI/InfiAlign"><img src="https://img.shields.io/badge/GitHub-Repository-181717?style=flat&logo=github&logoColor=white" alt="GitHub Repository"></a>
</p>

**InfiAlign** is a scalable and data-efficient post-training framework that combines supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) with a high-quality data selection pipeline to enhance reasoning in large language models.
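
DPO here refers to the standard preference-optimization objective of Rafailov et al. (2023), in which the SFT checkpoint typically serves as the reference policy. The card does not specify any InfiAlign-specific modification, so the vanilla objective is shown for reference:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta \log \frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]
$$

where \\((x, y_w, y_l)\\) is a prompt paired with a preferred and a rejected response, \\(\sigma\\) is the logistic function, and \\(\beta\\) controls how far \\(\pi_\theta\\) may drift from the reference policy.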
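The card does not include a quickstart, so below is a minimal usage sketch. It assumes the model exposes the standard `transformers` causal-LM and chat-template interface of Qwen-based checkpoints; the prompt and generation settings are illustrative only.

```python
# Minimal usage sketch (assumption: standard transformers chat interface
# for Qwen-style checkpoints; adjust dtype/device/generation to your setup).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "InfiX-ai/InfiAlign-Qwen-7B-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat-formatted prompt (the question is illustrative).
messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning-tuned models often benefit from a generous token budget.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```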