Set library_name and pipeline tag

This PR makes sure the Transformers library and question-answering pipeline are recognized for your model.

Files changed (1)

README.md +15 -7

README.md CHANGED

@@ -1,10 +1,11 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen2.5-Math-7B
-pipeline_tag: token-classification
 datasets:
 - HuggingFaceH4/prm800k-trl-dedup
 ---
 > [!Warning]
@@ -31,8 +32,11 @@ datasets:
 > **PURE's PRM** is a process reward model typically used for offering feedback on the quality of reasoning and intermediate steps rather than generation.
 ### Prerequisites
-- Step Separation: We recommend using double line breaks ("\n\n") to separate individual steps within the solution.
-- Reward Computation: After each step, we insert a token "`\n`". For reward calculation, we extract the probability score of this token and subtract negative probabilities from positive probabilities, resulting in a reward value between -1 and 1. We regard steps with reward > 0 as correct, otherwise as incorrect.
 ### 🤗 Hugging Face Transformers
@@ -80,7 +84,8 @@ steps = [
     "To find the difference, subtract the number of white flamingos from the number of pink flamingos: (36 - 6 = 30). Therefore, at noon on Sunday, there were 30 more pink plastic flamingos out than white plastic flamingos. The answer is (\\boxed{30})."
 ]
-step_separator = "\n"
 step_separator_token = tokenizer(
     step_separator,
     add_special_tokens=False,
@@ -166,7 +171,8 @@ model = AutoModelForTokenClassification.from_pretrained(
     trust_remote_code=True,
 ).eval()
-step_separator = "\n"
 step_separator_token = tokenizer(
     step_separator,
     add_special_tokens=False,
@@ -190,7 +196,9 @@ for ds_item, ds_name in zip(ds, ds_names):
             return_tensors='pt',
         )['input_ids']
         for answer in tqdm(answers, desc="Processing answers"):
-            steps = [i.rstrip() for i in answer.split("\n\n")]
             input_ids = question_ids.clone()
             score_ids = []

 ---
 base_model:
 - Qwen/Qwen2.5-Math-7B
 datasets:
 - HuggingFaceH4/prm800k-trl-dedup
+license: apache-2.0
+pipeline_tag: question-answering
+library_name: transformers
 ---
 > [!Warning]
 > **PURE's PRM** is a process reward model typically used for offering feedback on the quality of reasoning and intermediate steps rather than generation.
 ### Prerequisites
+- Step Separation: We recommend using double line breaks ("\n\n") to separate individual steps within the solution.
+- Reward Computation: After each step, we insert a token "`\n`". For reward calculation, we extract the probability score of this token and subtract negative probabilities from positive probabilities, resulting in a reward value between -1 and 1. We regard steps with reward > 0 as correct, otherwise as incorrect.
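The reward rule in the Prerequisites can be illustrated with a minimal sketch in plain Python. It assumes the model emits two logits (negative, positive) at each step-separator token and that the positive class sits at index 1; both the class order and the helper name `step_rewards` are assumptions made for illustration, not something the README confirms.

```python
import math

def step_rewards(separator_logits, positive_index=1):
    """Turn per-step [class0, class1] logits into rewards in [-1, 1].

    separator_logits: one pair of logits per step-separator token.
    positive_index: which class is "positive" (assumed to be 1 here).
    """
    rewards = []
    for pair in separator_logits:
        # Numerically stable two-class softmax.
        peak = max(pair)
        exps = [math.exp(x - peak) for x in pair]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Reward = P(positive) - P(negative).
        rewards.append(probs[positive_index] - probs[1 - positive_index])
    return rewards

# Made-up logits for three steps, purely for illustration.
example_logits = [[-2.0, 2.0], [0.0, 0.0], [1.5, -1.5]]
rewards = step_rewards(example_logits)

# Steps with reward > 0 are treated as correct, otherwise as incorrect.
labels = ["correct" if r > 0 else "incorrect" for r in rewards]
```

Because the two probabilities sum to 1, the reward always falls between -1 and 1, and a step whose two classes are equally likely gets a reward of 0 and is therefore counted as incorrect.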
 ### 🤗 Hugging Face Transformers
     "To find the difference, subtract the number of white flamingos from the number of pink flamingos: (36 - 6 = 30). Therefore, at noon on Sunday, there were 30 more pink plastic flamingos out than white plastic flamingos. The answer is (\\boxed{30})."
 ]
+step_separator = "\n"
 step_separator_token = tokenizer(
     step_separator,
     add_special_tokens=False,
     trust_remote_code=True,
 ).eval()
+step_separator = "\n"
 step_separator_token = tokenizer(
     step_separator,
     add_special_tokens=False,
             return_tensors='pt',
         )['input_ids']
         for answer in tqdm(answers, desc="Processing answers"):
+            steps = [i.rstrip() for i in answer.split("\n\n")]
             input_ids = question_ids.clone()
             score_ids = []