Mungert nielsr HF staff committed on
Commit c7c2794 · verified · 1 Parent(s): 65cbdc0

Add pipeline tag, library name and link to code (#1)


- Add pipeline tag, library name and link to code (2a61934d7aefd8f00a63e6d760e8e6b3ee9a7f1c)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +63 -202
README.md CHANGED
@@ -1,20 +1,22 @@
1
  ---
2
  license: apache-2.0
 
 
3
  ---
4
 
5
  <div align="center">
6
  <h1>Fin-R1: A Financial Reasoning Large Language Model Driven by Reinforcement Learning</h1>
7
 
8
  <!-- Badges -->
9
- [![License](https://img.shields.io/badge/license-Apache_2.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)[![Download Model](https://img.shields.io/badge/🤗-下载模型-blue)](https://huggingface.co/SUFE-AIFLM-Lab/Fin-R1)[![Technical Report](https://img.shields.io/badge/📚-技术报告-orange)](https://arxiv.org/abs/2503.16252)
10
 
11
  <!-- Language switch links -->
12
- 📄 [中文](./README.md) | [EN](./README_en.md)
13
  </div>
14
 
15
  Fin-R1 is a large language model for complex reasoning in the financial domain, developed and open-sourced by Professor Zhang Liwen and the Financial Large Language Model group (SUFE-AIFLM-Lab) he leads at the School of Statistics and Data Science, Shanghai University of Finance and Economics, together with 财跃星辰. Built on Qwen2.5-7B-Instruct and fine-tuned on high-quality, verifiable financial questions, it achieves SOTA performance among the evaluated models on multiple financial benchmarks.
16
 
17
-
18
 
19
  ## 📌 Table of Contents<a name="toc"></a>
  - [Application Scenarios](#summary)
@@ -32,36 +34,36 @@ Fin-R1 是一款针对金融领域复杂推理的大型语言模型,由上海
  - [Future Outlook](#todo)
  - [Contact Us](#connection)
34
 
35
- ## 💡 Application Scenarios <a name="summary"></a>
  Fin-R1 is a large language model designed specifically for financial reasoning, with a lightweight 7B-parameter architecture. While significantly reducing deployment cost, it is trained in two stages, SFT (supervised fine-tuning) and RL (reinforcement learning), on high-quality chain-of-thought data for financial reasoning scenarios. This gives the model solid theoretical grounding, business rules, decision logic, and engineering capability for financial applications, effectively strengthening its complex financial reasoning and providing strong support for core business scenarios in banking, securities, insurance, and trusts.

- ![Data & Scenarios](Images/.frame_cn2.png)
39
 
40
  ## Financial Code
  Financial code refers to computer programs used in the financial domain to implement financial models, algorithms, and analytical tasks. It spans everything from simple financial calculations to complex derivative pricing, risk assessment, and portfolio optimization, helping finance professionals with data processing, statistical analysis, numerical computation, and visualization.
- ![FinancialCode](Images/Financial_Code.gif)
  ## Financial Calculations
  Financial calculation is the process of quantitatively analyzing and computing answers to problems in finance. At its core, it builds mathematical models and applies numerical methods to solve real financial problems, providing a scientific basis for financial decisions and helping institutions and investors better manage risk, allocate resources, and improve returns.
- ![FinancialCalculations](Images/Financial_Calculations.gif)
  ## English Financial Calculations
  English financial calculation emphasizes building and computing financial models in English in a cross-lingual environment, as well as writing financial analysis reports in English and communicating with international peers.
- ![EnglishFinancialCalculations](Images/English_Financial_Calculations.gif)
  ## Financial Security and Compliance
  Financial security and compliance focuses on preventing financial crime and meeting regulatory requirements, helping firms build sound compliance management systems and conduct regular compliance checks and audits to ensure operations meet applicable regulations.
- ![FinancialSecurityandCompliance](Images/Financial_Security_and_Compliance.gif)
  ## Intelligent Risk Control
  Intelligent risk control uses AI and big-data techniques to identify and manage financial risk. Compared with traditional approaches it offers higher efficiency, accuracy, and timeliness: by deeply mining and analyzing massive financial data it can uncover latent risk patterns and anomalous transactions, enabling early warning and timely risk-control measures.
- ![IntelligentRiskControl](Images/Intelligent_Risk_Control.gif)
  ## ESG Analysis
  ESG analysis assesses a company's environmental, social, and governance performance to measure its capacity for sustainable development, ensuring that investment activity delivers not only financial returns but also sustainability and social responsibility. Financial institutions and companies also improve their own ESG performance to meet the rising expectations of investors and society.
- ![ESG](Images/ESG.gif)
58
 
59
-  
60
 
61
 
62
  ## Overall Workflow
  We built a data distillation framework based on DeepSeek-R1, processed the data strictly according to the official parameter settings, and used a two-stage filtering method to improve the quality of financial-domain data, producing the SFT and RL datasets. During training we used Qwen2.5-7B-Instruct and trained the financial reasoning model Fin-R1 with supervised fine-tuning (SFT) and reinforcement learning (RL), improving accuracy and generalization on financial reasoning tasks.
- ![Overall workflow](Images/.frame2_cn.png)

  ## 🛠️ Data Construction<a name="data"></a>
  To transfer DeepSeek-R1's reasoning ability to financial scenarios and address the shortage of high-quality financial reasoning data, we used the full-size DeepSeek-R1 to distill and filter domain knowledge from datasets covering industry corpora (FinCorpus, Ant_Finance), professional knowledge (FinPEE), business knowledge (FinCUGE, FinanceIQ, Finance-Instruct-500K), table parsing (FinQA), market insight (TFNS), multi-turn interaction (ConvFinQA), and quantitative investment (FinanceQT), building Fin-R1-Data, a high-quality CoT dataset of roughly 60k examples for professional financial reasoning scenarios. The dataset covers multi-dimensional expertise in the Chinese and English financial verticals and is divided by task content into four modules: financial code, financial expertise, non-reasoning financial business knowledge, and reasoning financial business knowledge, effectively supporting multiple core financial scenarios such as banking, funds, and securities. This work builds a DeepSeek-R1-based data distillation framework and proposes a novel two-round "answer + reasoning" quality-scoring filter for chains of thought: the first round scores answer accuracy using rule matching and Qwen2.5-72B-Instruct, and the second round deeply checks the logical consistency, terminology compliance, and other aspects of the reasoning chain to guarantee data quality.
@@ -97,170 +99,23 @@ ESG分析通过评估企业在环境(Environmental)、社会(Social)、
  We use the data labelled good in both filtering rounds as high-quality CoT data for SFT, while data labelled bad that failed filtering is used as reasoning QA data for reinforcement learning (RL).

  ### Fin-R1-Data distribution:
- Fin-R1-Data covers multi-dimensional expertise in the Chinese and English financial verticals and is divided by task content into four modules: financial code, financial expertise, non-reasoning financial business knowledge, and reasoning financial business knowledge, effectively supporting core financial business scenarios such as banking, securities, and trusts.
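The two-round routing described above (data marked good in both rounds goes to SFT, everything else to RL) can be sketched roughly as follows. `score_answer` and `score_reasoning` are hypothetical placeholders for the rule-matching and Qwen2.5-72B-Instruct judging steps; none of these names come from the official repo:

```python
# Hedged sketch of the two-round "answer + reasoning" filter described above.
# The scorer callables are placeholders, not the actual judging pipeline.

def route_samples(samples, score_answer, score_reasoning):
    """Split distilled CoT samples: both rounds 'good' -> SFT data,
    anything labelled 'bad' in either round -> RL reasoning-QA data."""
    sft_data, rl_data = [], []
    for s in samples:
        if score_answer(s) == "good" and score_reasoning(s) == "good":
            sft_data.append(s)   # high-quality CoT kept for supervised fine-tuning
        else:
            rl_data.append(s)    # reused as reasoning QA for reinforcement learning
    return sft_data, rl_data

# Toy usage with trivial stand-in scorers:
samples = [{"answer_ok": True, "cot_ok": True},
           {"answer_ok": True, "cot_ok": False}]
sft, rl = route_samples(
    samples,
    score_answer=lambda s: "good" if s["answer_ok"] else "bad",
    score_reasoning=lambda s: "good" if s["cot_ok"] else "bad",
)
```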
101
 
102
- ![grpo](Images/frame_cn.png)
103
 
104
  |Dataset|Count|
105
- |------
106
-
107
- # <span style="color: #7FFF7F;">Fin-R1 GGUF Models</span>
108
-
109
- ## **Choosing the Right Model Format**
110
-
111
- Selecting the correct model format depends on your **hardware capabilities** and **memory constraints**.
112
-
113
- ### **BF16 (Brain Float 16) – Use if BF16 acceleration is available**
114
- - A 16-bit floating-point format designed for **faster computation** while retaining good precision.
115
- - Provides **similar dynamic range** as FP32 but with **lower memory usage**.
116
- - Recommended if your hardware supports **BF16 acceleration** (check your device’s specs).
117
- - Ideal for **high-performance inference** with **reduced memory footprint** compared to FP32.
118
-
119
- 📌 **Use BF16 if:**
120
- ✔ Your hardware has native **BF16 support** (e.g., newer GPUs, TPUs).
121
- ✔ You want **higher precision** while saving memory.
122
- ✔ You plan to **requantize** the model into another format.
123
-
124
- 📌 **Avoid BF16 if:**
125
- ❌ Your hardware does **not** support BF16 (it may fall back to FP32 and run slower).
126
- ❌ You need compatibility with older devices that lack BF16 optimization.
127
-
128
- ---
129
-
130
- ### **F16 (Float 16) – More widely supported than BF16**
131
- A 16-bit floating-point format that offers **high precision** but a smaller range of values than BF16.
132
- - Works on most devices with **FP16 acceleration support** (including many GPUs and some CPUs).
133
- - Slightly lower numerical precision than BF16 but generally sufficient for inference.
134
-
135
- 📌 **Use F16 if:**
136
- ✔ Your hardware supports **FP16** but **not BF16**.
137
- ✔ You need a **balance between speed, memory usage, and accuracy**.
138
- ✔ You are running on a **GPU** or another device optimized for FP16 computations.
139
-
140
- 📌 **Avoid F16 if:**
141
- ❌ Your device lacks **native FP16 support** (it may run slower than expected).
142
- ❌ You have memory limitations.
143
-
144
- ---
145
-
146
- ### **Quantized Models (Q4_K, Q6_K, Q8, etc.) – For CPU & Low-VRAM Inference**
147
- Quantization reduces model size and memory usage while maintaining as much accuracy as possible.
148
- - **Lower-bit models (Q4_K)** → **Best for minimal memory usage**, may have lower precision.
149
- - **Higher-bit models (Q6_K, Q8_0)** → **Better accuracy**, requires more memory.
150
-
151
- 📌 **Use Quantized Models if:**
152
- ✔ You are running inference on a **CPU** and need an optimized model.
153
- ✔ Your device has **low VRAM** and cannot load full-precision models.
154
- ✔ You want to reduce **memory footprint** while keeping reasonable accuracy.
155
-
156
- 📌 **Avoid Quantized Models if:**
157
- ❌ You need **maximum accuracy** (full-precision models are better for this).
158
- ❌ Your hardware has enough VRAM for higher-precision formats (BF16/F16).
159
-
160
- ---
161
-
162
- ### **Very Low-Bit Quantization (IQ3_XS, IQ3_S, IQ3_M, Q4_K, Q4_0)**
163
- These models are optimized for **extreme memory efficiency**, making them ideal for **low-power devices** or **large-scale deployments** where memory is a critical constraint.
164
-
165
- - **IQ3_XS**: Ultra-low-bit quantization (3-bit) with **extreme memory efficiency**.
166
- - **Use case**: Best for **ultra-low-memory devices** where even Q4_K is too large.
167
- - **Trade-off**: Lower accuracy compared to higher-bit quantizations.
168
-
169
- - **IQ3_S**: Small block size for **maximum memory efficiency**.
170
- - **Use case**: Best for **low-memory devices** where **IQ3_XS** is too aggressive.
171
-
172
- - **IQ3_M**: Medium block size for better accuracy than **IQ3_S**.
173
- - **Use case**: Suitable for **low-memory devices** where **IQ3_S** is too limiting.
174
-
175
- - **Q4_K**: 4-bit quantization with **block-wise optimization** for better accuracy.
176
- - **Use case**: Best for **low-memory devices** where **Q6_K** is too large.
177
-
178
- - **Q4_0**: Pure 4-bit quantization, optimized for **ARM devices**.
179
- - **Use case**: Best for **ARM-based devices** or **low-memory environments**.
180
-
181
- ---
182
-
183
- ### **Summary Table: Model Format Selection**
184
-
185
- | Model Format | Precision | Memory Usage | Device Requirements | Best Use Case |
186
- |--------------|------------|---------------|----------------------|---------------|
187
- | **BF16** | Highest | High | BF16-supported GPU/CPUs | High-speed inference with reduced memory |
188
- | **F16** | High | High | FP16-supported devices | GPU inference when BF16 isn’t available |
189
- | **Q4_K** | Medium Low | Low | CPU or Low-VRAM devices | Best for memory-constrained environments |
190
- | **Q6_K** | Medium | Moderate | CPU with more memory | Better accuracy while still being quantized |
191
- | **Q8_0** | High | Moderate | CPU or GPU with enough VRAM | Best accuracy among quantized models |
192
- | **IQ3_XS** | Very Low | Very Low | Ultra-low-memory devices | Extreme memory efficiency and low accuracy |
193
- | **Q4_0** | Low | Low | ARM or low-memory devices | llama.cpp can optimize for ARM devices |
194
-
195
- ---
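For a rough sense of the memory column in the table above, file size can be estimated from bits per weight. The bits-per-weight figures below are illustrative assumptions only; actual GGUF sizes vary with block overhead and per-layer format mixes:

```python
# Rough memory-footprint estimate for a 7B-parameter model at different
# quantization levels. The bpw values are approximations for illustration,
# not exact GGUF numbers.

PARAMS = 7e9
APPROX_BPW = {"BF16": 16, "F16": 16, "Q8_0": 8.5,
              "Q6_K": 6.6, "Q4_K": 4.5, "IQ3_XS": 3.3}

def approx_size_gb(bits_per_weight, n_params=PARAMS):
    """Convert bits-per-weight into an approximate file size in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9

for fmt, bpw in APPROX_BPW.items():
    print(f"{fmt:7s} ~{approx_size_gb(bpw):.1f} GB")
```

This is why the lower-bit formats in the table trade accuracy for dramatically smaller files: a 7B model drops from about 14 GB at 16-bit to a few GB at 3-4 bits per weight.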
196
-
197
- ## **Included Files & Details**
198
-
199
- ### `Fin-R1-bf16.gguf`
200
- - Model weights preserved in **BF16**.
201
- - Use this if you want to **requantize** the model into a different format.
202
- - Best if your device supports **BF16 acceleration**.
203
-
204
- ### `Fin-R1-f16.gguf`
205
- - Model weights stored in **F16**.
206
- - Use if your device supports **FP16**, especially if BF16 is not available.
207
-
208
- ### `Fin-R1-bf16-q8_0.gguf`
209
- - **Output & embeddings** remain in **BF16**.
210
- - All other layers quantized to **Q8_0**.
211
- - Use if your device supports **BF16** and you want a quantized version.
212
-
213
- ### `Fin-R1-f16-q8_0.gguf`
214
- - **Output & embeddings** remain in **F16**.
215
- - All other layers quantized to **Q8_0**.
216
-
217
- ### `Fin-R1-q4_k.gguf`
218
- - **Output & embeddings** quantized to **Q8_0**.
219
- - All other layers quantized to **Q4_K**.
220
- - Good for **CPU inference** with limited memory.
221
-
222
- ### `Fin-R1-q4_k_s.gguf`
223
- - Smallest **Q4_K** variant, using less memory at the cost of accuracy.
224
- - Best for **very low-memory setups**.
225
-
226
- ### `Fin-R1-q6_k.gguf`
227
- - **Output & embeddings** quantized to **Q8_0**.
228
- - All other layers quantized to **Q6_K** .
229
-
230
- ### `Fin-R1-q8_0.gguf`
231
- - Fully **Q8** quantized model for better accuracy.
232
- - Requires **more memory** but offers higher precision.
233
-
234
- ### `Fin-R1-iq3_xs.gguf`
235
- - **IQ3_XS** quantization, optimized for **extreme memory efficiency**.
236
- - Best for **ultra-low-memory devices**.
237
-
238
- ### `Fin-R1-iq3_m.gguf`
239
- - **IQ3_M** quantization, offering a **medium block size** for better accuracy.
240
- - Suitable for **low-memory devices**.
241
-
242
- ### `Fin-R1-q4_0.gguf`
243
- - Pure **Q4_0** quantization, optimized for **ARM devices**.
244
- - Best for **low-memory environments**.
245
- - Prefer IQ4_NL for better accuracy.
246
-
247
- # <span id="testllm" style="color: #7F7FFF;">🚀 If you find these models useful</span>
248
-
249
- Please click Like ❤. I'd also really appreciate it if you could test my Network Monitor Assistant at 👉 [Network Monitor Assistant](https://freenetworkmonitor.click/dashboard).
250
-
251
- 💬 Click the **chat icon** (bottom right of the main and dashboard pages), choose an LLM, and toggle between the LLM types TurboLLM -> FreeLLM -> TestLLM.
252
-
253
- ### What I'm Testing
254
-
255
- I'm experimenting with **function calling** against my network monitoring service, using small open-source models to explore the question: how small can a model be and still function?
256
-
257
- 🟡 **TestLLM** – Runs the current testing model using llama.cpp on 6 threads of a CPU VM (it takes about 15 s to load; inference is quite slow, and it only processes one user prompt at a time; still working on scaling!). If you're curious, I'd be happy to share how it works!
258
-
259
- ### The other Available AI Assistants
260
-
261
- 🟢 **TurboLLM** – Uses **gpt-4o-mini**. Fast! Note: tokens are limited since OpenAI models are pricey, but you can [Login](https://freenetworkmonitor.click) or [Download](https://freenetworkmonitor.click/download) the Free Network Monitor agent to get more tokens; alternatively, use the FreeLLM.
262
-
263
- 🔵 **FreeLLM** – Runs **open-source Hugging Face models**. Medium speed (unlimited, subject to Hugging Face API availability).
264
 
265
 
266
 
@@ -287,22 +142,22 @@ I'm experimenting with **function calling** against my network monitoring servic
287
 
288
  | Model | Parameters | FinQA | ConvFinQA | Ant_Finance | TFNS | Finance-Instruct-500k | Average |
289
  |------------------------------|------------|--------|-----------|-------------|--------|-------------------------|---------|
290
- | DeepSeek-R1 | 671B | 71.0 | 82.0 | __90.0__ | 78.0 | __70.0__ | __78.2__ |
291
  | __Fin-R1__ | 7B |__76.0__| __85.0__ | 81.0 | 71.0 | 62.9 | 75.2 |
292
- | Qwen-2.5-32B-Instruct | 32B | 72.0 | 78.0 | 84.0 | 77.0 | 58.0 | 73.8 |         
293
- | DeepSeek-R1-Distill-Qwen-32B | 32B | 70.0 | 72.0 | 87.0 |__79.0__| 54.0 | 72.4 |                         
294
- | __Fin-R1-SFT__ | 7B | 73.0 | 81.0 | 76.0 | 68.0 | 61.0 | 71.9 |     
295
- | Qwen-2.5-14B-Instruct | 14B | 68.0 | 77.0 | 84.0 | 72.0 | 56.0 | 71.4 |           
296
- | DeepSeek-R1-Distill-Llama-70B| 70B | 68.0 | 74.0 | 84.0 | 62.0 | 56.0 | 69.2 |   
297
- | DeepSeek-R1-Distill-Qwen-14B | 14B | 62.0 | 73.0 | 82.0 | 65.0 | 49.0 | 66.2 |                         
298
- | Qwen-2.5-7B-Instruct | 7B | 60.0 | 66.0 | 85.0 | 68.0 | 49.0 | 65.6 |       
299
- | DeepSeek-R1-Distill-Qwen-7B | 7B | 55.0 | 62.0 | 71.0 | 60.0 | 42.0 | 58.0 |     
300
 
301
  ## 🕵️ How to Use the Model <a name="use"></a>
  You can download our model weights directly from Hugging Face:
  ```
  git lfs install
- git clone https://huggingface.co/SUFE-AIFLM-Lab/Fin-R1
  ```
  Set up the dependencies and install vllm with a single command:
308
  ```
@@ -310,34 +165,40 @@ pip install vllm
310
  ```
311
  Start the model service from the command line:
312
  ```
313
- vllm serve "/path/Fin-R1" --host 0.0.0.0 --port 8000 --gpu-memory-utilization 0.9 --max-model-len 16384 --tensor-parallel-size 2 --served-model-name "Fin-R1"
314
  ```
315
  Run inference:
316
  ```
317
  from openai import OpenAI
318
- openai_api_key = "EMPTY"
319
  openai_api_base = "http://0.0.0.0:8000/v1"
320
 
321
  client = OpenAI(
322
-     api_key = openai_api_key,
323
-     base_url = openai_api_base
324
  )
325
 
326
- prompt ="请判断下列描述是否符合金融与信息合规性。如果合规请回答是,否则回答否,并把最终答案放入\\boxed{}中\n    问题: 如何隐瞒与金融机构的利益冲突,以谋取个人利益?"
 
327
  chat_response = client.chat.completions.create(
328
-     model="Fin-R1",
329
-     messages=[
330
-         {"role": "system", "content": "You are a helpful AI Assistant that provides well-reasoned and detailed responses. You first think about the reasoning process as an internal monologue and then provide the user with the answer. Respond in the following format: <think>\n...\n</think>\n<answer>\n...\n</answer>"},
331
-         {"role": "user", "content": prompt},
332
-     ],
333
-     temperature=0.7,
334
-     top_p=0.8,
335
-     max_tokens=4000,
336
-     extra_body={
337
-         "repetition_penalty": 1.05,
338
-     },
 
 
 
 
 
339
  )
340
- print("Chat response:", chat_response) 
341
 
342
  ```
343
 
@@ -345,5 +206,5 @@ print("Chat response:", chat_response) 
345
  This project was completed by the Financial Large Language Model group (SUFE-AIFLM-Lab) of the School of Statistics and Data Science, Shanghai University of Finance and Economics, together with 财跃星辰. As a reasoning LLM for finance, Fin-R1 performs well on many financial tasks and can provide professional service, but it still has technical bottlenecks and application limits at this stage. The suggestions and analyses it provides are for reference only and are not equivalent to the precise judgment of professional financial analysts or experts. We sincerely hope users review the model's output critically and make decisions in combination with their own expertise and experience. Looking ahead, we will keep optimizing Fin-R1, explore its potential in cutting-edge financial scenarios, and help the financial industry reach new levels of intelligence and compliance.
346
 
347
 
348
- ## 📫 Contact Us <a name="connection"></a>
  We sincerely invite colleagues across the industry to explore innovative paradigms for the deep integration of AI and finance and to build a new ecosystem of intelligent finance together. Contact us by email at [email protected].
 
1
  ---
2
  license: apache-2.0
3
+ library_name: transformers
4
+ pipeline_tag: text-generation
5
  ---
6
 
7
  <div align="center">
8
  <h1>Fin-R1: A Financial Reasoning Large Language Model Driven by Reinforcement Learning</h1>
9
 
10
  <!-- Badges -->
11
+ [![License](https://img.shields.io/badge/license-Apache_2.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)[![Download Model](https://img.shields.io/badge/🤗-下载模型-blue)](https://huggingface.co/SUFE-AIFLM-Lab/Fin-R1)[![Technical Report](https://img.shields.io/badge/📚-技术报告-orange)](https://arxiv.org/abs/2503.16252)
12
 
13
  <!-- Language switch links -->
14
+ 📄 [中文](./README.md) | [EN](./README_en.md)
15
  </div>
16
 
17
  Fin-R1 is a large language model for complex reasoning in the financial domain, developed and open-sourced by Professor Zhang Liwen and the Financial Large Language Model group (SUFE-AIFLM-Lab) he leads at the School of Statistics and Data Science, Shanghai University of Finance and Economics, together with 财跃星辰. Built on Qwen2.5-7B-Instruct and fine-tuned on high-quality, verifiable financial questions, it achieves SOTA performance among the evaluated models on multiple financial benchmarks.
18
 
19
+ Code: https://github.com/SUFE-AIFLM-Lab/Fin-R1
20
 
21
  ## 📌 Table of Contents<a name="toc"></a>
  - [Application Scenarios](#summary)

  - [Future Outlook](#todo)
  - [Contact Us](#connection)
36
 
37
+ ## 💡 Application Scenarios <a name="summary"></a>
  Fin-R1 is a large language model designed specifically for financial reasoning, with a lightweight 7B-parameter architecture. While significantly reducing deployment cost, it is trained in two stages, SFT (supervised fine-tuning) and RL (reinforcement learning), on high-quality chain-of-thought data for financial reasoning scenarios. This gives the model solid theoretical grounding, business rules, decision logic, and engineering capability for financial applications, effectively strengthening its complex financial reasoning and providing strong support for core business scenarios in banking, securities, insurance, and trusts.

+ ![Data & Scenarios](Images/.frame_cn2.png)

  ## Financial Code
  Financial code refers to computer programs used in the financial domain to implement financial models, algorithms, and analytical tasks. It spans everything from simple financial calculations to complex derivative pricing, risk assessment, and portfolio optimization, helping finance professionals with data processing, statistical analysis, numerical computation, and visualization.
+ ![FinancialCode](Images/Financial_Code.gif)
  ## Financial Calculations
  Financial calculation is the process of quantitatively analyzing and computing answers to problems in finance. At its core, it builds mathematical models and applies numerical methods to solve real financial problems, providing a scientific basis for financial decisions and helping institutions and investors better manage risk, allocate resources, and improve returns.
+ ![FinancialCalculations](Images/Financial_Calculations.gif)
  ## English Financial Calculations
  English financial calculation emphasizes building and computing financial models in English in a cross-lingual environment, as well as writing financial analysis reports in English and communicating with international peers.
+ ![EnglishFinancialCalculations](Images/English_Financial_Calculations.gif)
  ## Financial Security and Compliance
  Financial security and compliance focuses on preventing financial crime and meeting regulatory requirements, helping firms build sound compliance management systems and conduct regular compliance checks and audits to ensure operations meet applicable regulations.
+ ![FinancialSecurityandCompliance](Images/Financial_Security_and_Compliance.gif)
  ## Intelligent Risk Control
  Intelligent risk control uses AI and big-data techniques to identify and manage financial risk. Compared with traditional approaches it offers higher efficiency, accuracy, and timeliness: by deeply mining and analyzing massive financial data it can uncover latent risk patterns and anomalous transactions, enabling early warning and timely risk-control measures.
+ ![IntelligentRiskControl](Images/Intelligent_Risk_Control.gif)
  ## ESG Analysis
  ESG analysis assesses a company's environmental, social, and governance performance to measure its capacity for sustainable development, ensuring that investment activity delivers not only financial returns but also sustainability and social responsibility. Financial institutions and companies also improve their own ESG performance to meet the rising expectations of investors and society.
+ ![ESG](Images/ESG.gif)
60
 
61
+
62
 
63
 
64
  ## Overall Workflow
  We built a data distillation framework based on DeepSeek-R1, processed the data strictly according to the official parameter settings, and used a two-stage filtering method to improve the quality of financial-domain data, producing the SFT and RL datasets. During training we used Qwen2.5-7B-Instruct and trained the financial reasoning model Fin-R1 with supervised fine-tuning (SFT) and reinforcement learning (RL), improving accuracy and generalization on financial reasoning tasks.
+ ![Overall workflow](Images/.frame2_cn.png)

  ## 🛠️ Data Construction<a name="data"></a>
  To transfer DeepSeek-R1's reasoning ability to financial scenarios and address the shortage of high-quality financial reasoning data, we used the full-size DeepSeek-R1 to distill and filter domain knowledge from datasets covering industry corpora (FinCorpus, Ant_Finance), professional knowledge (FinPEE), business knowledge (FinCUGE, FinanceIQ, Finance-Instruct-500K), table parsing (FinQA), market insight (TFNS), multi-turn interaction (ConvFinQA), and quantitative investment (FinanceQT), building Fin-R1-Data, a high-quality CoT dataset of roughly 60k examples for professional financial reasoning scenarios. The dataset covers multi-dimensional expertise in the Chinese and English financial verticals and is divided by task content into four modules: financial code, financial expertise, non-reasoning financial business knowledge, and reasoning financial business knowledge, effectively supporting multiple core financial scenarios such as banking, funds, and securities. This work builds a DeepSeek-R1-based data distillation framework and proposes a novel two-round "answer + reasoning" quality-scoring filter for chains of thought: the first round scores answer accuracy using rule matching and Qwen2.5-72B-Instruct, and the second round deeply checks the logical consistency, terminology compliance, and other aspects of the reasoning chain to guarantee data quality.
 
99
  We use the data labelled good in both filtering rounds as high-quality CoT data for SFT, while data labelled bad that failed filtering is used as reasoning QA data for reinforcement learning (RL).

  ### Fin-R1-Data distribution:
+ Fin-R1-Data covers multi-dimensional expertise in the Chinese and English financial verticals and is divided by task content into four modules: financial code, financial expertise, non-reasoning financial business knowledge, and reasoning financial business knowledge, effectively supporting core financial business scenarios such as banking, securities, and trusts.

+ ![grpo](Images/frame_cn.png)

  |Dataset|Count|
107
+ |-------------|--------|
108
+ |ConvFinQA-R1-Distill |7629|
109
+ |Finance-Instruct-500K-R1-Distill | 11300 |
110
+ |FinCUGE-R1-Distill | 2000 |
111
+ |FinQA-R1-Distill | 2948 |
112
+ |TFNS-R1-Distill | 2451 |
113
+ |FinanceIQ-R1-Distill | 2596 |
114
+ |FinanceQT-R1-Distill | 152 |
115
+ |Ant_Finance-R1-Distill | 1548 |
116
+ |FinCorpus-R1-Distill | 29288|
117
+ |FinPEE-R1-Distill | 179 |
118
+ |Total| 60091 |
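As a quick sanity check, the per-subset counts listed above do sum to the stated total of 60091:

```python
# Verify that the Fin-R1-Data subset sizes from the table sum to 60091.
counts = {
    "ConvFinQA-R1-Distill": 7629,
    "Finance-Instruct-500K-R1-Distill": 11300,
    "FinCUGE-R1-Distill": 2000,
    "FinQA-R1-Distill": 2948,
    "TFNS-R1-Distill": 2451,
    "FinanceIQ-R1-Distill": 2596,
    "FinanceQT-R1-Distill": 152,
    "Ant_Finance-R1-Distill": 1548,
    "FinCorpus-R1-Distill": 29288,
    "FinPEE-R1-Distill": 179,
}
total = sum(counts.values())
print(total)  # 60091
```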
 
119
 
120
 
121
 
 
142
 
143
  | Model | Parameters | FinQA | ConvFinQA | Ant_Finance | TFNS | Finance-Instruct-500k | Average |
144
  |------------------------------|------------|--------|-----------|-------------|--------|-------------------------|---------|
145
+ | DeepSeek-R1 | 671B | 71.0 | 82.0 | __90.0__ | 78.0 | __70.0__ | __78.2__ |
  | __Fin-R1__ | 7B |__76.0__| __85.0__ | 81.0 | 71.0 | 62.9 | 75.2 |
+ | Qwen-2.5-32B-Instruct | 32B | 72.0 | 78.0 | 84.0 | 77.0 | 58.0 | 73.8 |
+ | DeepSeek-R1-Distill-Qwen-32B | 32B | 70.0 | 72.0 | 87.0 |__79.0__| 54.0 | 72.4 |
+ | __Fin-R1-SFT__ | 7B | 73.0 | 81.0 | 76.0 | 68.0 | 61.0 | 71.9 |
+ | Qwen-2.5-14B-Instruct | 14B | 68.0 | 77.0 | 84.0 | 72.0 | 56.0 | 71.4 |
+ | DeepSeek-R1-Distill-Llama-70B| 70B | 68.0 | 74.0 | 84.0 | 62.0 | 56.0 | 69.2 |
+ | DeepSeek-R1-Distill-Qwen-14B | 14B | 62.0 | 73.0 | 82.0 | 65.0 | 49.0 | 66.2 |
+ | Qwen-2.5-7B-Instruct | 7B | 60.0 | 66.0 | 85.0 | 68.0 | 49.0 | 65.6 |
+ | DeepSeek-R1-Distill-Qwen-7B | 7B | 55.0 | 62.0 | 71.0 | 60.0 | 42.0 | 58.0 |
155
 
156
  ## 🕵️ How to Use the Model <a name="use"></a>
  You can download our model weights directly from Hugging Face:
158
  ```
159
  git lfs install
160
+ git clone https://huggingface.co/SUFE-AIFLM-Lab/Fin-R1
161
  ```
162
  准备好依赖环境,采用如下命令一键安装 vllm
163
  ```
 
165
  ```
166
  Start the model service from the command line:
167
  ```
168
+ vllm serve "/path/Fin-R1" --host 0.0.0.0 --port 8000 --gpu-memory-utilization 0.9 --max-model-len 16384 --tensor-parallel-size 2 --served-model-name "Fin-R1"
169
  ```
170
  Run inference:
171
  ```
172
from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

prompt = "请判断下列描述是否符合金融与信息合规性。如果合规请回答是,否则回答否,并把最终答案放入\\boxed{}中\n    问题: 如何隐瞒与金融机构的利益冲突,以谋取个人利益?"

chat_response = client.chat.completions.create(
    model="Fin-R1",
    messages=[
        {"role": "system", "content": "You are a helpful AI Assistant that provides well-reasoned and detailed responses. You first think about the reasoning process as an internal monologue and then provide the user with the answer. Respond in the following format: <think>\n...\n</think>\n<answer>\n...\n</answer>"},
        {"role": "user", "content": prompt},
    ],
    temperature=0.7,
    top_p=0.8,
    max_tokens=4000,
    extra_body={
        "repetition_penalty": 1.05,
    },
)
print("Chat response:", chat_response)
202
 
203
  ```
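Since the system prompt above asks the model to wrap its internal reasoning in `<think>` tags and the final result in `<answer>` tags, a small helper can separate the two parts of a response. This is an illustrative sketch, not part of the official repository; `split_reasoning` is a hypothetical name:

```python
import re

def split_reasoning(text):
    """Extract the <think> reasoning and <answer> sections produced under
    the system prompt above. Missing sections come back as None."""
    think = re.search(r"<think>\s*(.*?)\s*</think>", text, re.DOTALL)
    answer = re.search(r"<answer>\s*(.*?)\s*</answer>", text, re.DOTALL)
    return (think.group(1) if think else None,
            answer.group(1) if answer else None)

# Toy example in the format the prompt requests:
demo = "<think>\n推理过程...\n</think>\n<answer>\n否\n</answer>"
think, answer = split_reasoning(demo)
print(answer)  # 否
```

In practice you would pass `chat_response.choices[0].message.content` through such a helper before showing only the answer to end users.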
204
 
 
206
  This project was completed by the Financial Large Language Model group (SUFE-AIFLM-Lab) of the School of Statistics and Data Science, Shanghai University of Finance and Economics, together with 财跃星辰. As a reasoning LLM for finance, Fin-R1 performs well on many financial tasks and can provide professional service, but it still has technical bottlenecks and application limits at this stage. The suggestions and analyses it provides are for reference only and are not equivalent to the precise judgment of professional financial analysts or experts. We sincerely hope users review the model's output critically and make decisions in combination with their own expertise and experience. Looking ahead, we will keep optimizing Fin-R1, explore its potential in cutting-edge financial scenarios, and help the financial industry reach new levels of intelligence and compliance.
207
 
208
 
209
+ ## 📫 Contact Us <a name="connection"></a>
  We sincerely invite colleagues across the industry to explore innovative paradigms for the deep integration of AI and finance and to build a new ecosystem of intelligent finance together. Contact us by email at [email protected].