dmariko/SmolLM-360M-Instruct-dpo-16k
TensorBoard
Safetensors
English
llama
trl
dpo
Generated from Trainer
License: cc-by-nc-4.0
main
Commit History
cf11bd8  Update README.md  (verified; dmariko committed on Sep 12, 2024)
9c8fa35  Upload tokenizer  (verified; dmariko committed on Sep 12, 2024)
6436edf  Upload LlamaForCausalLM  (verified; dmariko committed on Sep 12, 2024)
4eea8a9  SmolLM-360M-Instruct-dpo-16k  (verified; dmariko committed on Sep 12, 2024)
b231ecd  initial commit  (verified; dmariko committed on Sep 11, 2024)