---
base_model:
- google/gemma-3-1b-it
tags:
- text-generation-inference
- transformers
- unsloth
- GRPO
- conversational
- gemma3_text
- reasoning
license: apache-2.0
language:
- en
datasets:
- NuclearAi/HyperThink-Mini-50K
---

# About Model

- **Developed by:** NuclearAi
- **License:** apache-2.0
- **Finetuned from model:** google/gemma-3-1b-it

**Gemma** is a family of lightweight, state-of-the-art open models from Google, built using the same research and technology as the **Gemini** models. However, Gemma falls short in **reasoning**, making it less capable than some other models on reasoning-heavy tasks. At **Nuclear AI**, we enhance Gemma's abilities by leveraging **GRPO** and a specialized dataset to improve its reasoning skills.

Our previous, experimental thinking version of Gemma3-1B was trained on only 150 rows of high-quality data. **This time we fine-tuned on much more data: 5,000 rows of a high-quality dataset, which took around 70 minutes of training.** An illustrative GRPO training sketch is included at the end of this card.

We would love to hear your feedback so we can work on fine-tuning a larger version with more steps and greater computational power.

---

## Installing Libraries

```bash
# 1. Install the specific Gemma 3 compatible transformers build
pip install --no-deps git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

# 2. Install Unsloth (adjust for your environment - e.g., remove [colab-new] if not on Colab)
pip install "unsloth[colab-new]@git+https://github.com/unslothai/unsloth.git"

# 3. Install PyTorch (select the command for your CUDA version from https://pytorch.org/)
# Example for CUDA 12.1:
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Example for CPU only:
# pip install torch torchvision torchaudio

# 4. Install accelerate and bitsandbytes
pip install accelerate bitsandbytes
```

## Code To Run

```python
import torch
from unsloth import FastModel
from transformers import TextStreamer

# 1. Model and tokenizer loading
max_seq_length = 1024
model_name = "NuclearAi/Nuke_X_Gemma3_1B_Reasoner_v1.0"

print(f"Loading model: {model_name}...")
model, tokenizer = FastModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    dtype = None,         # Let Unsloth choose the best dtype (float16, bf16, float32)
    load_in_4bit = False, # Set to True if you want 4-bit quantization
    device_map = "auto",  # Automatically use GPU if available
)
print("Model loaded.")

# 2. Define prompt structure
# Note: the delimiter strings below are empty in this card; if your training
# format used explicit tags, set them accordingly.
reasoning_start = ""
reasoning_end   = ""
solution_start = ""
solution_end = ""

system_prompt = f"""You are given a problem.
Think about the problem and provide your working out.
Place it between {reasoning_start} and {reasoning_end}.
Then, provide your solution between {solution_start}{solution_end}"""

# 3. User input
user_question = "Write a short story about a cat who learns to fly."  # Try another question

# 4. Format input for the chat model
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user",   "content": user_question},
]
text_input = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # Important for generation
)

# 5. Tokenize and prepare for generation
device = model.device if hasattr(model, 'device') else ('cuda' if torch.cuda.is_available() else 'cpu')
inputs = tokenizer([text_input], return_tensors="pt").to(device)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
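
# Optional sanity check (not part of the original card, just an illustrative
# addition): confirm the prompt plus the requested new tokens fits within the
# max_seq_length budget set above.
prompt_tokens = inputs["input_ids"].shape[-1]
print(f"Prompt length: {prompt_tokens} tokens (context budget: {max_seq_length})")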

# 6. Generate response
print("\n--- Model Response ---")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        streamer=streamer,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.9,
        top_k=50,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
print("\n--- End of Response ---")
```

---

Thank you for your support!

**Jay Shree Ram 🚩🚩**
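
---

## GRPO Fine-Tuning Sketch (Illustrative)

For readers curious about the GRPO step mentioned above, here is a minimal sketch built around `trl`'s `GRPOTrainer`. It is **not** the exact script used to train this model: the dataset column name, the toy reward function, and the hyperparameters are assumptions you should adapt to your own setup.

```python
# Illustrative GRPO fine-tuning sketch - NOT the exact script used for this model.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Assumption: the dataset exposes a plain-text "prompt" column and a "train" split.
# Rename the real column if needed, e.g.:
# dataset = dataset.rename_column("question", "prompt")
dataset = load_dataset("NuclearAi/HyperThink-Mini-50K", split="train")

def format_reward(completions, **kwargs):
    """Toy reward: favour non-trivial completions.
    Replace with a real reasoning/format reward for serious training."""
    return [1.0 if len(c.strip()) > 50 else 0.0 for c in completions]

training_args = GRPOConfig(
    output_dir="gemma3-1b-grpo",
    per_device_train_batch_size=8,
    num_generations=8,          # completions sampled per prompt
    max_prompt_length=256,
    max_completion_length=512,
    learning_rate=5e-6,
    logging_steps=10,
)

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",
    reward_funcs=format_reward,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

GRPO samples several completions per prompt, scores them with the reward function(s), and pushes the policy toward completions that score above the group average, which is why no separate reward model is needed in this sketch.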