|
--- |
|
license: gemma |
|
tags: |
|
- gemma3 |
|
- gemma |
|
- google |
|
pipeline_tag: image-text-to-text |
|
library_name: transformers |
|
base_model: |
|
- google/gemma-3-27b-it-qat-q4_0-unquantized |
|
--- |
|
|
|
<p align="left"> |
|
<img width="65%" src="Fornax.jpg"> |
|
</p> |
|
|
|
### Gemma 3 27B V4 Fornax |
|
|
|
Gemma Fornax is a distillation of the updated DeepSeek R1 05/28 onto Gemma 3 27B, with a particular focus on timely and generalizable reasoning beyond coding and math.

Most other open-source thinking models, especially smaller ones, fail to generalize their reasoning to tasks other than coding or math because of an overly large focus on GRPO-zero training for CoT, which only generalizes to coding and math.
|
|
|
Instead of using GRPO, this model was trained by SFT on a wide variety of high-quality, diverse reasoning traces from DeepSeek R1 05/28, forcing Gemma 3 to learn to generalize its reasoning capabilities across a large number of tasks, extending the LIMO paper's approach beyond math/coding CoT.
|
|
|
Varying CoT length in conjunction with explicit noise regularization during training also prevents the characteristic length overfitting of GRPO, which tends to manifest as waffling: the model reasons to a set length even when it has already reached an answer.
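

As a rough illustration of the recipe described above (not the actual training code for this model), an SFT run over distilled reasoning traces might look like the following sketch with TRL's `SFTTrainer`. The dataset id is a hypothetical placeholder, and `neftune_noise_alpha` is used here as one plausible form of explicit noise regularization; both are assumptions, not confirmed details of this model's training.

```python
# Hedged sketch of SFT on distilled R1 05/28 reasoning traces with TRL.
# Dataset id is a hypothetical placeholder; hyperparameters are illustrative only.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical dataset of chat-formatted R1 05/28 reasoning traces with varied CoT lengths.
dataset = load_dataset("your-org/r1-0528-reasoning-traces", split="train")

config = SFTConfig(
    output_dir="gemma-3-27b-fornax-sft",
    num_train_epochs=2,
    learning_rate=1e-5,
    bf16=True,
    neftune_noise_alpha=5.0,  # NEFTune embedding noise; an assumed stand-in for the card's "noise regularization"
)

trainer = SFTTrainer(
    model="google/gemma-3-27b-it-qat-q4_0-unquantized",  # the base model listed in this card's frontmatter
    train_dataset=dataset,
    args=config,
)
trainer.train()
```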
|
|
|
|
|
## Recommended Settings |
|
|
|
Temperature 0.7 with top-nσ (n-sigma) sampling set to 1.
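

A minimal inference sketch with the `transformers` image-text-to-text pipeline, using the recommended temperature. The model id and image URL are placeholders, and top-nσ sampling is not a standard `transformers` sampler knob, so it is assumed to be configured in a backend that supports it (e.g., certain local inference engines).

```python
# Minimal inference sketch; model id and image URL are placeholders.
# Temperature follows the recommended 0.7. Top-nσ = 1 must be set in a
# backend that implements n-sigma sampling (assumption; not shown here).
from transformers import pipeline

model_id = "your-org/Gemma-3-27B-Fornax-V4"  # placeholder: replace with this repo's id

pipe = pipeline("image-text-to-text", model=model_id, device_map="auto", torch_dtype="bfloat16")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/diagram.png"},  # hypothetical image
            {"type": "text", "text": "Walk through what this diagram shows, step by step."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(out[0]["generated_text"][-1]["content"])
```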
|
|
|
## Special Thanks
|
|
|
Google, for open-sourcing the excellent Gemma 3 model line.