
SmolLM2-Rethink-135M

SmolLM2-Rethink-135M is an experimental lightweight model trained on the Celestia3-DeepSeek-R1-0528 reasoning dataset. Based on the SmolLM2-135M-Instruct architecture, this model is specifically optimized for reasoning, structured outputs, and efficient small-scale deployment. Despite its compact size (135M parameters), it demonstrates strong capabilities in logical deduction, conversational coherence, and lightweight inference tasks.


Key Highlights

  1. Compact & Efficient: Lightweight architecture (135M parameters) suitable for fast inference, mobile applications, and edge deployment.

  2. Reasoning-Centric Training: Fine-tuned on high-quality reasoning and instruction datasets such as Celestia3-DeepSeek-R1-0528, with a focus on multi-step logical thinking.

  3. Low-Resource Optimization: Designed to run effectively on CPUs or single-GPU setups with a minimal memory footprint.

  4. Structured Outputs: Supports generation of clean, structured content, including lists, steps, tables, and JSON-like responses.
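
To give the memory-footprint claim a rough scale, the sketch below estimates the size of the weights alone at a few common precisions. It assumes the nominal 135M parameter count and standard bytes-per-parameter values; activations, the KV cache, and framework overhead are not included, so real usage will be higher. The helper name `weight_memory_mb` is illustrative and not part of any library.

```python
# Rough weight-memory estimate; ignores activations, KV cache, and framework overhead.
BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_mb(num_params: int, dtype: str = "float32") -> float:
    """Return the approximate size of the model weights in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_DTYPE[dtype] / 1e6


NUM_PARAMS = 135_000_000  # nominal parameter count, taken from the model name

for dtype in BYTES_PER_DTYPE:
    print(f"{dtype:>8}: {weight_memory_mb(NUM_PARAMS, dtype):.0f} MB")
```

At float32 this comes to roughly 540 MB of weights, which is why the model fits comfortably on a CPU-only machine.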


Quickstart with 🤗 Transformers

# Install the library once before running (in a notebook cell, prefix with "!"):
#   pip install transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "prithivMLmods/SmolLM2-Rethink-135M"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "What is gravity?"}]
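# add_generation_prompt=True appends the assistant turn marker so the model answers the user
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)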
print(input_text)

inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    temperature=0.2,
    top_p=0.9,
    do_sample=True
)

print(tokenizer.decode(outputs[0]))
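
Since the model is described as producing JSON-like responses, and a small model may wrap such output in extra prose, a defensive parser can be useful when consuming the decoded text. The following is a minimal sketch using only the standard library; the function name `extract_first_json` and the `sample` string are hypothetical illustrations, not actual model output.

```python
import json


def extract_first_json(text: str):
    """Return the first decodable JSON object found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one
            start = text.find("{", start + 1)
    return None


# Hypothetical decoded text, made up for illustration
sample = 'Sure! Here is the result: {"answer": "gravity", "steps": [1, 2]} Hope that helps.'
print(extract_first_json(sample))
```

In practice the decoded string from `tokenizer.decode(outputs[0])` would be passed in place of `sample`; a return value of `None` signals that no valid JSON object was found and the caller should retry or fall back.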

Intended Use

  • Instruction Following & QA: Good for answering simple questions, following short instructions, and handling general user interactions.

  • Educational Tools: Suitable for lightweight tutoring bots or classroom assistants on low-compute setups.

  • Reasoning Tasks: Performs well on logic puzzles, multi-step reasoning, and chain-of-thought queries.

  • Prototype Agents & Microservices: Can be deployed in memory-efficient environments or as a modular AI component.


Limitations

  1. Limited Knowledge Capacity: Because of its small parameter count, the model lacks the depth and breadth of knowledge of large-scale models.

  2. Short-Term Context Handling: Performs best with short to moderate-length prompts; it does not offer extended context support.

  3. Creative Generation Limitations: Output may lack diversity or depth in open-ended storytelling or imaginative tasks.

  4. Token Budget: Smaller output range; optimized for shorter, structured completions.

  5. Basic Multilingual Support: Some support for multilingual input, but less accurate than larger multilingual models.
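
Given the preference for short prompts noted above, one common workaround is to trim older conversation turns before calling the model. The sketch below keeps the most recent messages that fit a budget. The function `trim_messages`, the whitespace-based token counter, and the budget of 16 are all illustrative assumptions, not values documented for this model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def trim_messages(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[Message]:
    """Keep the most recent messages whose combined token count fits max_tokens."""
    kept: List[Message] = []
    used = 0
    # Walk backwards so the newest turns are kept first
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    # Restore chronological order
    return list(reversed(kept))


# Made-up conversation for illustration
history = [
    {"role": "user", "content": "Explain Newton's first law in one sentence."},
    {"role": "assistant", "content": "An object keeps its state of motion unless a force acts on it."},
    {"role": "user", "content": "What is gravity?"},
]
print(trim_messages(history, max_tokens=16))
```

The default counter splits on whitespace only as a stand-in. For a closer estimate, a tokenizer-based counter such as `lambda s: len(tokenizer.encode(s))` could be passed as `count_tokens`, with the budget chosen to suit the deployment.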

Model size: 135M params (Safetensors), tensor type F32