estnafinema0/smolLM-variation-dpo
Text Generation
Transformers
Safetensors
English
llama
DPO
RLHF
Fine-tuning
SmolLM
Direct Preference Optimization
conversational
text-generation-inference
License: apache-2.0
main
Commit History
Update README.md (16d610a, verified), estnafinema0 committed on Mar 30
Upload tokenizer (d80113e, verified), estnafinema0 committed on Mar 30
Upload LlamaForCausalLM (2208881, verified), estnafinema0 committed on Mar 30
initial commit (5e35805, verified), estnafinema0 committed on Mar 30