Michielo committed on
Commit 2fff865 · verified · 1 Parent(s): 9eec746

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -29,7 +29,7 @@ tags:
 
 **SmolLM2-135M-Humanized** is a fine-tuned version of the [SmolLM2-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct) model, optimized using the Direct Preference Optimization (DPO) method. To do this we used the "[Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)" from [Human-Like LLMs](https://huggingface.co/HumanLLMs).
 
-Unlike traditional fine-tuning approaches that aim to improve specific benchmarks or metrics, DPO fine-tuning focuses on aligning the model's behavior with human preferences. This process enhances the model's ability to generate more natural, human-like responses, making it particularly well-suited for conversational applications.
+Unlike traditional fine-tuning datasets that aim to improve specific benchmarks or metrics, the Human-Like-DPO-Dataset focuses on aligning the model's behavior with human preferences. This process enhances the model's ability to generate more natural, human-like responses, making it particularly well-suited for conversational applications.
 
 By emphasizing response quality and relatability, SmolLM2-135M-Humanized is designed to deliver an engaging and intuitive user experience in dialogue-based scenarios.
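
For context on the change above: the README describes a DPO fine-tune of SmolLM2-135M-Instruct on the Human-Like-DPO-Dataset. Below is a minimal sketch of how such a run could look with the `trl` library; the hyperparameters and training setup are illustrative assumptions, not the values actually used to produce SmolLM2-135M-Humanized.

```python
# Hypothetical sketch of a DPO fine-tune on the Human-Like-DPO-Dataset.
# Hyperparameters (beta, batch size, epochs) are assumptions for illustration.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "HuggingFaceTB/SmolLM2-135M-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The dataset already provides the "prompt", "chosen", and "rejected"
# columns that DPOTrainer expects for preference optimization.
dataset = load_dataset("HumanLLMs/Human-Like-DPO-Dataset", split="train")

training_args = DPOConfig(
    output_dir="SmolLM2-135M-Humanized",
    beta=0.1,                       # assumed KL-penalty strength
    per_device_train_batch_size=4,  # assumed batch size
    num_train_epochs=1,             # assumed number of epochs
)

# With no explicit ref_model, DPOTrainer keeps a frozen copy of the
# starting model as the reference policy.
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```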