Model Card for SmolLM2-135M-sft
This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on the wassname/ultrachat_200k_filtered dataset. It was trained using TRL.
Quick start
from transformers import pipeline
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="wassname/SmolLM2-135M-sft", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
Training procedure
This model was trained with supervised fine-tuning (SFT) using the alignment-handbook.
https://wandb.ai/wassname/huggingface/runs/wwekda9p
Eval metrics
On test set:
model | eval_loss |
---|---|
SFT | 1.6235 |
base | 1.8994 |
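To make the gap between the two losses easier to read, they can be converted to perplexities. The sketch below assumes that eval_loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model evaluation but is not stated in this card. The variable names are illustrative only.

```python
import math

# eval_loss values copied from the table above.
eval_loss = {"SFT": 1.6235, "base": 1.8994}

# Assumption: each loss is a mean per-token cross-entropy in nats,
# so perplexity = exp(loss).
perplexity = {name: math.exp(loss) for name, loss in eval_loss.items()}

# Relative perplexity reduction of the SFT model versus the base model.
reduction = 1.0 - perplexity["SFT"] / perplexity["base"]

for name, ppl in perplexity.items():
    print(f"{name}: eval_loss={eval_loss[name]:.4f} perplexity={ppl:.2f}")
print(f"relative perplexity reduction: {reduction:.1%}")
```

Under that assumption, the SFT model's perplexity on the test set is roughly 5.07 against roughly 6.68 for the base model, a reduction of about 24%.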
Model /workspace/checkpoints_new/SmolLM2-135M-sft:
Question:
Welcome to the glittering dawn of the year 3000. As a cultural anthropologist focusing on early 21st century Western society, what are several of the moral positions and practices that our modern audience might find quite puzzling and anomalous? (list, 100 words)
SFT Answer:
The modern audience might find it challenging to reconcile the high expectations of the Western culture with the lack of traditional values and practices. In contrast, the ancient audience might be more open to accepting change and experimentation, recognizing that the traditions and practices that have been passed down for generations are still important to them. Additionally, the modern audience might be more willing to experiment and explore new ideas without fear of judgment or rejection.
In conclusion, the moral positions and practices of the modern audience differ from those of the ancient audience. In the early 21st century, Western culture is characterized by the dominant values of individualism, pragmatism, and progress. However, the ancient audience may find it more challenging to reconcile the high standards of the Western culture with the lack of traditional values and practices. This might be because the ancient audience is more accustomed to tradition and customs, and may be less comfortable with change and experimentation.
Base model Answer:
[Output not recoverable: the text survives only as a long run of mis-encoded characters.]
Framework versions
- TRL: 0.12.1
- Transformers: 4.52.4
- Pytorch: 2.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
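When reproducing results, it can help to compare a local environment against the versions listed above. The following is a minimal sketch using only the standard library. The helper name compare_versions is hypothetical, and the mapping assumes the usual PyPI distribution names (for example, torch for Pytorch).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by PyPI distribution name.
EXPECTED = {
    "trl": "0.12.1",
    "transformers": "4.52.4",
    "torch": "2.7.0",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected, installed or None)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions().items():
        # Drop local build tags such as "+cu126" before comparing.
        base = installed.split("+")[0] if installed else None
        status = "ok" if base == wanted else "differs"
        print(f"{name}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load. It only flags a difference from the environment recorded here.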
Citations
Cite TRL as:
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}