Update README.md
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-base_model:
+base_model: gemma-3-27b-it
 tags:
 - text-generation-inference
 - transformers
@@ -21,7 +21,7 @@ inference:
 # Gemma 3 27B GRPO Reasoning Model
 
 ## Model Description
-- **Developed by:**
+- **Developed by:** klei1
 - **Model type:** Gemma 3 27B fine-tuned with GRPO for reasoning tasks
 - **License:** apache-2.0
 - **Finetuned from model:** Google's Gemma 3 27B instruction-tuned model
@@ -38,9 +38,6 @@ This model was fine-tuned using GRPO (Group Relative Policy Optimization),
 2. Produce correct mathematical solutions
 3. Show clear step-by-step reasoning processes
 
-### Training Data
-The model was fine-tuned on the GSM8K dataset containing grade school math problems, teaching the model to break down problems, think step-by-step, and arrive at accurate solutions.
-
 ### Special Formatting
 The model has been trained to follow a specific reasoning format:
 - Working out/reasoning sections are enclosed within `<start_working_out>` and `<end_working_out>` tags
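The tag convention kept in the Special Formatting section above can be post-processed with plain pattern matching. A minimal sketch, assuming the tags appear exactly as documented; `split_response` and the sample string are illustrative and not part of the released model or card:

```python
import re

# Tags documented in the model card's Special Formatting section.
WORKING_RE = re.compile(r"<start_working_out>(.*?)<end_working_out>", re.DOTALL)
SOLUTION_RE = re.compile(r"<SOLUTION>(.*?)</SOLUTION>", re.DOTALL)

def split_response(text: str):
    """Return (working_out, solution) extracted from a model response, or None if a span is missing."""
    working = WORKING_RE.search(text)
    solution = SOLUTION_RE.search(text)
    return (
        working.group(1).strip() if working else None,
        solution.group(1).strip() if solution else None,
    )

# Hypothetical model output used only to demonstrate the parser.
sample = (
    "<start_working_out>There are 3 boxes with 4 apples each, "
    "so 3 * 4 = 12 apples.<end_working_out>"
    "<SOLUTION>12</SOLUTION>"
)
print(split_response(sample))
```

`re.DOTALL` lets the working-out span cover multiple lines, which step-by-step answers usually do.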
@@ -65,17 +62,6 @@ This model is available in two formats:
 - 5 sliding + 1 global attention pattern
 - 1024 sliding window attention
 
-## System Prompt
-
-To get the best results from this model, use this system prompt:
-
-```
-You are given a problem.
-Think about the problem and provide your working out.
-Place it between <start_working_out> and <end_working_out>.
-Then, provide your solution between <SOLUTION></SOLUTION>
-```
-
 ## Limitations
 - While this model excels at reasoning tasks, particularly mathematical problems, it may still occasionally provide incorrect solutions for complex problems.
 - The model's performance might vary depending on problem complexity and wording.
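Although this revision removes the System Prompt section, the prompt it contained matches the tag format the card still documents, so a usage sketch may help readers of the diff. This assumes a standard transformers chat checkpoint; the repo id below is a placeholder, and if the checkpoint's chat template rejects a `system` role the prompt can be prepended to the user message instead:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual fine-tuned checkpoint.
MODEL_ID = "your-namespace/gemma-3-27b-grpo-reasoning"

# System prompt quoted from the section removed in this revision.
SYSTEM_PROMPT = (
    "You are given a problem.\n"
    "Think about the problem and provide your working out.\n"
    "Place it between <start_working_out> and <end_working_out>.\n"
    "Then, provide your solution between <SOLUTION></SOLUTION>"
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "A baker makes 12 trays of 8 cookies each. How many cookies is that?"},
]

# Render the chat template, generate, and decode only the newly generated tokens.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The decoded text can then be passed to a parser like the one sketched earlier to separate the working out from the final `<SOLUTION>` span.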