---
base_model: HuggingFaceTB/SmolLM2-360M
datasets:
- wassname/ultrachat_200k_filtered
library_name: transformers
model_name: SmolLM2-360M-sft
tags:
- generated_from_trainer
- alignment-handbook
licence: license
---

# Model Card for SmolLM2-360M-sft

This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-360M](https://huggingface.co/HuggingFaceTB/SmolLM2-360M) on the [wassname/ultrachat_200k_filtered](https://huggingface.co/datasets/wassname/ultrachat_200k_filtered) dataset. It has been trained using [TRL](https://github.com/huggingface/trl).

## Quick start

```python
from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"

# Note: the model id below is the local checkpoint path used elsewhere in this
# card; replace it with the Hub repo id if the model has been pushed to the Hub.
generator = pipeline("text-generation", model="/workspace/checkpoints_new/SmolLM2-360M-sft", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

## Training procedure

[Visualize in Weights & Biases](https://wandb.ai/wassname/huggingface/runs/gs4a36gl)

This model was trained with SFT.

- SFT model `eval_loss`: 1.4015671014785767
- Base model `eval_loss`: 1.6745600700378418

## Model `/workspace/checkpoints_new/SmolLM2-360M-sft`

### Question

Anon. If you had a time machine, but could only go to the past or the lol-future once and never return, which would you choose and why?!?

### SFT Answer

I would choose the past, because I believe that it is the most exciting and mysterious place in the universe. The past is filled with so much wonder and mystery, and it's hard to imagine what's to come in the lol-future. I find the past so fascinating because it's the only place where I can experience the thrill of a perfect day, without having to worry about the consequences of my actions. So, I would choose the past for sure!

### Base Model Answer

Anon. I would choose the past.
I remember being able to do all the things my friends could do on the internet, and it was exciting. I also remember people who had their own personal websites and blogs. And then I had my own blog. I have a lot of feelings about the internet and the internet is so easy. I just want to feel that I'm contributing something to it, and I haven't been able to do so. I want to have my own space on the internet. I have always been fascinated by the internet, and I wanted to create a place where I could share my stuff and have it be there for everyone else. Anon. Anon. "I want to be able to write a story in the style of an action movie and have it be a comic book. I want to have a character who is really powerful and can do anything and has superpowers and really is a superhero. I want to have a character who is really good at a lot of things and has a lot of power and has got a lot of skills and special abilities. I want to have a character who is really good at a lot of things and who can do everything and has a lot of power and who has a lot of skills and special

## Framework versions

- TRL: 0.12.1
- Transformers: 4.52.4
- Pytorch: 2.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1

## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```
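## Interpreting the eval losses

The eval losses reported in the training procedure section are mean token-level cross-entropies (in nats). Exponentiating converts them to perplexities, which are often easier to compare at a glance. A minimal, self-contained sketch (using the exact numbers from this card):

```python
import math

# Eval losses reported in the training procedure section above
# (mean token-level cross-entropy, in nats).
sft_loss = 1.4015671014785767
base_loss = 1.6745600700378418

# Perplexity = exp(loss). SFT drops perplexity from ~5.34 to ~4.06,
# i.e. the fine-tuned model is noticeably less "surprised" by the
# held-out chat data than the base model.
sft_ppl = math.exp(sft_loss)
base_ppl = math.exp(base_loss)

print(f"SFT perplexity:  {sft_ppl:.2f}")   # ~4.06
print(f"Base perplexity: {base_ppl:.2f}")  # ~5.34
```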