Add library_name and pipeline_tag

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -1,6 +1,9 @@
 ---
 license: mit
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 <div align="center">
 <h1 align="center"> KnowRL </h1>
 <h3 align="center"> Exploring Knowledgeable Reinforcement Learning for Factuality </h3>
@@ -62,8 +65,7 @@ print(response)
 ### Using `huggingface-cli`
 You can also download the model from the command line using `huggingface-cli`.
 
-```
-bash
+```bash
 huggingface-cli download zjunlp/KnowRL-DeepSeek-R1-Distill-Qwen-7B --local-dir KnowRL-DeepSeek-R1-Distill-Qwen-7B
 ```
 
@@ -72,7 +74,7 @@ huggingface-cli download zjunlp/KnowRL-DeepSeek-R1-Distill-Qwen-7B --local-dir K
 The model's training process involves two distinct stages, using the data from the `zjunlp/KnowRL-Train-Data` dataset.
 
 * **Stage 1: Cold-Start SFT**: The base model undergoes supervised fine-tuning on the `knowrl_coldstart.json` dataset. This stage helps the model adopt a fact-based, slow-thinking response structure.
-* **Stage 2: Knowledgeable RL**: The SFT-tuned model is further trained using reinforcement learning (GRPO). The reward function combines a correctness reward with a factuality reward, which is calculated by verifying the model's thinking process against an external knowledge base. This stage uses the `KnowRL_RLtrain_data_withknowledge.json` and `knowrl_RLdata.json` files.
+* **Stage 2: Knowledgeable Reinforcement Learning (RL)**: The SFT-tuned model is further trained using reinforcement learning (GRPO). The reward function combines a correctness reward with a factuality reward, which is calculated by verifying the model's thinking process against an external knowledge base. This stage uses the `KnowRL_RLtrain_data_withknowledge.json` and `knowrl_RLdata.json` files.
 
 For complete details on the training configuration and hyperparameters, please refer to our [GitHub repository](https://github.com/zjunlp/KnowRL).
 
@@ -87,5 +89,4 @@ bibtex
 author={Ren, Baochang and Qiao, Shuofei and Yu, Wenhao and Chen, Huajun and Zhang, Ningyu},
 journal={arXiv preprint arXiv:2506.19807},
 year={2025}
-}
-```
+}
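The Stage 2 description in the README (a correctness reward combined with a factuality reward computed against an external knowledge base) can be sketched in Python. This is a minimal illustration only, not the repository's actual implementation: the function names, the set-membership factuality check, and the 0.5 weighting are all assumptions, not details from the paper or the KnowRL codebase.

```python
def correctness_reward(answer: str, reference: str) -> float:
    """Hypothetical correctness signal: 1.0 if the final answer matches the reference."""
    return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0

def factuality_reward(thinking_steps: list[str], knowledge_base: set[str]) -> float:
    """Hypothetical factuality signal: fraction of thinking steps supported by an
    external knowledge base (the README says the thinking process is verified externally)."""
    if not thinking_steps:
        return 0.0
    supported = sum(1 for step in thinking_steps if step in knowledge_base)
    return supported / len(thinking_steps)

def combined_reward(answer: str, reference: str,
                    thinking_steps: list[str], knowledge_base: set[str],
                    w_fact: float = 0.5) -> float:
    """Combine the two signals; the equal 0.5 weight is illustrative, not from the paper."""
    return (1 - w_fact) * correctness_reward(answer, reference) \
        + w_fact * factuality_reward(thinking_steps, knowledge_base)

kb = {"Paris is the capital of France"}
print(combined_reward("Paris", "paris", ["Paris is the capital of France"], kb))  # 1.0
```

In GRPO this scalar would be computed per sampled completion and the group of rewards normalized into advantages; see the linked GitHub repository for the actual reward configuration.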