HuangXinBa committed on
Commit bc15fbb · verified · 1 Parent(s): 8cb0da9

Upload LlamaForCausalLM

Files changed (1)
  1. README.md +10 -9
README.md CHANGED
@@ -1,19 +1,20 @@
-
  ---
  license: apache-2.0
  language: en
  tags:
- - text-generation
- - causal-lm
- - reinforcement-learning
- - GRPO
- - instruction-tuning
- - chain-of-thought
+ - text-generation
+ - causal-lm
+ - reinforcement-learning
+ - GRPO
+ - instruction-tuning
+ - chain-of-thought
+ - trl
+ - grpo
  datasets:
- - gsm8k
+ - gsm8k
  pipeline_tag: text-generation
  widget:
- - text: "What is 27 plus 16? Let's think step by step."
+ - text: What is 27 plus 16? Let's think step by step.
  ---

  # GRPO: Finetuned Causal Language Model using Generalized Reinforcement Policy Optimization