Token Classification
Safetensors
qwen2
Commit 7566abc (verified) by nielsr (HF Staff) · 1 parent: 0a96d22

Set library_name and pipeline tag

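This PR sets `library_name: transformers` and `pipeline_tag: question-answering` in the README metadata so that the Hub recognizes the Transformers library and the question-answering pipeline for your model.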

Files changed (1)
  1. README.md +15 -7
README.md CHANGED
@@ -1,10 +1,11 @@
  ---
- license: apache-2.0
  base_model:
  - Qwen/Qwen2.5-Math-7B
- pipeline_tag: token-classification
  datasets:
  - HuggingFaceH4/prm800k-trl-dedup
  ---

  > [!Warning]
@@ -31,8 +32,11 @@ datasets:
  > **PURE's PRM** is a process reward model typically used for offering feedback on the quality of reasoning and intermediate steps rather than generation.

  ### Prerequisites
- - Step Separation: We recommend using double line breaks ("\n\n") to separate individual steps within the solution.
- - Reward Computation: After each step, we insert a token "`\n`". For reward calculation, we extract the probability score of this token and subtract negative probabilities from positive probabilities, resulting in a reward value between -1 and 1. We regard steps with reward > 0 as correct, otherwise as incorrect.

  ### 🤗 Hugging Face Transformers

@@ -80,7 +84,8 @@ steps = [
      "To find the difference, subtract the number of white flamingos from the number of pink flamingos: (36 - 6 = 30). Therefore, at noon on Sunday, there were 30 more pink plastic flamingos out than white plastic flamingos. The answer is (\\boxed{30})."
  ]

- step_separator = "\n"
  step_separator_token = tokenizer(
      step_separator,
      add_special_tokens=False,
@@ -166,7 +171,8 @@ model = AutoModelForTokenClassification.from_pretrained(
      trust_remote_code=True,
  ).eval()

- step_separator = "\n"
  step_separator_token = tokenizer(
      step_separator,
      add_special_tokens=False,
@@ -190,7 +196,9 @@ for ds_item, ds_name in zip(ds, ds_names):
      return_tensors='pt',
  )['input_ids']
  for answer in tqdm(answers, desc="Processing answers"):
-     steps = [i.rstrip() for i in answer.split("\n\n")]
      input_ids = question_ids.clone()

      score_ids = []
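
The step handling touched by this hunk can be illustrated on plain strings. This is a minimal sketch, not the README's implementation: the README tokenizes each step and appends the separator token id, whereas the hypothetical helpers `split_steps` and `mark_steps` below work on text only, assuming `"\n\n"` separates steps and `"\n"` is the inserted separator.

```python
def split_steps(answer):
    """Split a solution into steps on blank lines, dropping trailing whitespace."""
    return [step.rstrip() for step in answer.split("\n\n")]


def mark_steps(steps, step_separator="\n"):
    """Append the separator after every step, giving one scoring position per step."""
    return "".join(step + step_separator for step in steps)


answer = "Add 2 and 3 to get 5.  \n\nDouble it to get 10."
steps = split_steps(answer)
marked = mark_steps(steps)

print(steps)
print(repr(marked))
```

Each step ends with exactly one separator, so the number of separators equals the number of steps, which is what lets one reward be read per step.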
 
  ---
  base_model:
  - Qwen/Qwen2.5-Math-7B
  datasets:
  - HuggingFaceH4/prm800k-trl-dedup
+ license: apache-2.0
+ pipeline_tag: question-answering
+ library_name: transformers
  ---

  > [!Warning]
 
  > **PURE's PRM** is a process reward model typically used for offering feedback on the quality of reasoning and intermediate steps rather than generation.

  ### Prerequisites
+ - Step Separation: We recommend using double line breaks ("\n\n") to separate individual steps within the solution.
+ - Reward Computation: After each step, we insert a token "`\n`". For reward calculation, we extract the probability score of this token and subtract negative probabilities from positive probabilities, resulting in a reward value between -1 and 1. We regard steps with reward > 0 as correct, otherwise as incorrect.

  ### 🤗 Hugging Face Transformers

 
      "To find the difference, subtract the number of white flamingos from the number of pink flamingos: (36 - 6 = 30). Therefore, at noon on Sunday, there were 30 more pink plastic flamingos out than white plastic flamingos. The answer is (\\boxed{30})."
  ]

+ step_separator = "\n"
  step_separator_token = tokenizer(
      step_separator,
      add_special_tokens=False,
 
      trust_remote_code=True,
  ).eval()

+ step_separator = "\n"
  step_separator_token = tokenizer(
      step_separator,
      add_special_tokens=False,
 
      return_tensors='pt',
  )['input_ids']
  for answer in tqdm(answers, desc="Processing answers"):
+     steps = [i.rstrip() for i in answer.split("\n\n")]
      input_ids = question_ids.clone()

      score_ids = []
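
The reward rule from the README's Prerequisites (positive probability minus negative probability at the separator token, with reward > 0 treated as correct) can be sketched in plain Python. This is an illustration under assumptions, not the model's actual code: it assumes the two class logits at each separator position have already been extracted, that index 0 is the negative class and index 1 the positive class, and the names `step_reward` and `judge_steps` are hypothetical.

```python
import math


def step_reward(logits, positive_index=1, negative_index=0):
    """Return p(positive) - p(negative) for one separator-token position."""
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # With two classes the difference always lies in [-1, 1].
    return probs[positive_index] - probs[negative_index]


def judge_steps(per_step_logits):
    """Pair each step's reward with a correctness flag (reward > 0)."""
    rewards = [step_reward(logits) for logits in per_step_logits]
    return [(reward, reward > 0) for reward in rewards]


# Assumed example logits in [negative, positive] order, one pair per step.
results = judge_steps([[-1.0, 2.0], [2.0, -1.0]])
for index, (reward, is_correct) in enumerate(results):
    print(f"step {index}: reward={reward:+.3f} correct={is_correct}")
```

For two classes this difference equals `tanh((l_pos - l_neg) / 2)`, so the sign of the reward depends only on which logit is larger.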