klei1 committed on
Commit 96b3e70 · verified · 1 Parent(s): 6f04587

Update README.md

Files changed (1)
  1. README.md +2 -16
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- base_model: bleta-meditor-27b
+ base_model: gemma-3-27b-it
  tags:
  - text-generation-inference
  - transformers
@@ -21,7 +21,7 @@ inference:
  # Gemma 3 27B GRPO Reasoning Model
 
  ## Model Description
- - **Developed by:** klei aliaj
+ - **Developed by:** klei1
  - **Model type:** Gemma 3 27B fine-tuned with GRPO for reasoning tasks
  - **License:** apache-2.0
  - **Finetuned from model:** Google's Gemma 3 27B instruction-tuned model
@@ -38,9 +38,6 @@ This model was fine-tuned using GRPO (Generative Rejection Policy Optimization),
  2. Produce correct mathematical solutions
  3. Show clear step-by-step reasoning processes
 
- ### Training Data
- The model was fine-tuned on the GSM8K dataset containing grade school math problems, teaching the model to break down problems, think step-by-step, and arrive at accurate solutions.
-
  ### Special Formatting
  The model has been trained to follow a specific reasoning format:
  - Working out/reasoning sections are enclosed within `<start_working_out>` and `<end_working_out>` tags
@@ -65,17 +62,6 @@ This model is available in two formats:
  - 5 sliding + 1 global attention pattern
  - 1024 sliding window attention
 
- ## System Prompt
-
- To get the best results from this model, use this system prompt:
-
- ```
- You are given a problem.
- Think about the problem and provide your working out.
- Place it between <start_working_out> and <end_working_out>.
- Then, provide your solution between <SOLUTION></SOLUTION>
- ```
-
  ## Limitations
  - While this model excels at reasoning tasks, particularly mathematical problems, it may still occasionally provide incorrect solutions for complex problems.
  - The model's performance might vary depending on problem complexity and wording.
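
Note for reviewers: this commit removes the System Prompt section but keeps the Special Formatting section, so downstream code still has to pull the answer out of the tagged output. A minimal sketch of that parsing step follows. It assumes the tag names quoted in the README (`<start_working_out>`, `<end_working_out>`, `<SOLUTION>`); the helper name `split_response` and its return shape are hypothetical and not part of the model card.

```python
import re

# Tag names come from the README text shown in the diff.
# The helper name and the (working, solution) tuple are illustrative assumptions.
_WORKING = re.compile(r"<start_working_out>(.*?)<end_working_out>", re.DOTALL)
_SOLUTION = re.compile(r"<SOLUTION>(.*?)</SOLUTION>", re.DOTALL)


def split_response(text):
    """Return (working_out, solution); each part is None if its tags are missing."""
    working = _WORKING.search(text)
    solution = _SOLUTION.search(text)
    return (
        working.group(1).strip() if working else None,
        solution.group(1).strip() if solution else None,
    )


# Hand-written sample string, not real model output.
sample = "<start_working_out>3 + 4 = 7<end_working_out><SOLUTION>7</SOLUTION>"
working, solution = split_response(sample)
```

Returning `None` for a missing part, rather than raising, lets a caller detect responses that ignore the format, which may happen more often once the recommended system prompt is no longer documented.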