---
license: apache-2.0
datasets:
- sequelbox/Celestia3-DeepSeek-R1-0528
base_model:
- HuggingFaceTB/SmolLM2-360M-Instruct
library_name: transformers
language:
- en
pipeline_tag: text-generation
tags:
- trl
- text-generation-inference
- r1
- re-think
---

# **SmolLM2-Rethink-360M**
> **SmolLM2-Rethink-360M** is an experimental lightweight reasoning model trained on the **Celestia3-DeepSeek-R1-0528** dataset. Built on the 360M-parameter **SmolLM2-360M-Instruct** architecture, it is designed to enhance lightweight reasoning, logical deduction, and structured response generation, all while maintaining efficiency for resource-constrained environments.
---
## **Key Highlights**
1. **Compact Yet Powerful**
With 360M parameters, the model balances performance and efficiency, offering solid reasoning capabilities with fast inference speeds.
2. **Reasoning-Oriented Training**
Fine-tuned on the reasoning-focused **Celestia3-DeepSeek-R1-0528** dataset, which emphasizes logical, step-by-step thinking.
3. **Optimized for Edge & Research**
Usable on mid-range GPUs or CPU environments, making it ideal for experimentation, teaching, and lightweight deployment.
4. **Structured Generation Support**
Capable of outputting well-organized content such as JSON, lists, workflows, and tabular formats (see the example below).
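
As an illustration of the structured-generation point above, here is a minimal sketch that prompts the model for JSON and parses the reply. The prompt wording and parsing logic are assumptions for demonstration, not a documented output contract:

```py
import json
from transformers import pipeline

# Sketch only: ask the model for JSON and attempt to parse it.
# A small model may need a retry or light post-processing if output drifts.
pipe = pipeline("text-generation", model="prithivMLmods/SmolLM2-Rethink-360M")

messages = [{
    "role": "user",
    "content": "List three renewable energy sources as a JSON array of "
               "objects with 'name' and 'example_use' keys. Reply with JSON only."
}]
result = pipe(messages, max_new_tokens=256, do_sample=False)
reply = result[0]["generated_text"][-1]["content"]  # assistant turn

try:
    print(json.loads(reply))
except json.JSONDecodeError:
    print("Model reply was not valid JSON:\n", reply)
```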
---
## **Quickstart with 🤗 Transformers**
```bash
pip install transformers
```
```py
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "prithivMLmods/SmolLM2-Rethink-360M"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "What is gravity?"}]

# Build the chat-formatted prompt; add_generation_prompt appends the
# assistant header so the model answers instead of continuing the user turn.
input_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(input_text)

inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(
    inputs,
    max_new_tokens=1024,  # generous budget for step-by-step reasoning
    temperature=0.2,      # low temperature keeps the reasoning focused
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0]))
```
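
The decode above includes the prompt itself. To print only the model's reply, slice off the input tokens first (a small convenience, not part of the original snippet):

```py
# Decode only the tokens generated after the prompt.
reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)
```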
---
## **Intended Use**
* **Lightweight Reasoning Tasks**
Suitable for compact agents needing reasoning abilities without high compute requirements.
* **Educational & Research Assistants**
Ideal for logic tutors, student aides, or research prototypes.
* **Instruction Following & Structured QA**
Excels in scenarios requiring concise, step-by-step or well-formatted responses.
* **Microservices & Embedded AI**
Can be embedded in systems with modest hardware, enabling distributed or modular AI (a minimal service sketch follows this list).
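
To illustrate the microservice use case, here is a hedged sketch that wraps the model in a FastAPI endpoint. The route name, request schema, and generation settings are illustrative assumptions, not part of this release:

```py
# Illustrative sketch only: app, /generate, and Query are assumed names.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline("text-generation", model="prithivMLmods/SmolLM2-Rethink-360M")

class Query(BaseModel):
    prompt: str

@app.post("/generate")
def generate(query: Query):
    messages = [{"role": "user", "content": query.prompt}]
    out = generator(messages, max_new_tokens=256, temperature=0.2, do_sample=True)
    # Recent transformers versions return the full chat;
    # the last message is the assistant reply.
    return {"response": out[0]["generated_text"][-1]["content"]}
```

Run it with, for example, `uvicorn app:app` and POST a JSON body like `{"prompt": "What is gravity?"}` to `/generate`.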
---
## **Limitations**
1. **Knowledge Scope**
Smaller models naturally have less factual coverage compared to large-scale LLMs.
2. **Context Length**
Best used with shorter prompts and outputs due to token and memory constraints.
3. **Variability in Creative Tasks**
Less suited for imaginative writing or nuanced creative expression.
4. **Limited Real-World Awareness**
The model has no access to real-time information or knowledge of events after its training data cutoff.
5. **Prompt Sensitivity**
Outputs can vary based on phrasing; best results come from clear, guided prompts.