Models: 2,311
Sort: Trending
Active filters: ppo
ajagota71/pythia-410m-detox-irl-rlhf-seed-300 • Reinforcement Learning • Updated 7 days ago • 1
ajagota71/pythia-410m-detox-irl-rlhf-seed-400 • Reinforcement Learning • Updated 7 days ago • 1
S-Chaves/ppo-from-scratch-LunarLander-v2 • Reinforcement Learning • Updated 5 days ago
Arrebol-yzq/RLP_llm_inductive_model • Reinforcement Learning • Updated 5 days ago • 2
jcorblaz/LunarLander2 • Reinforcement Learning • Updated 5 days ago
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-20 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-40 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-60 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-80 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-100 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-120 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-140 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-160 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-180 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-70m-fb-detox-checkpoint-epoch-200 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-70m-fb-detox • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-20 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-60 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-160m-fb-detox-checkpoint-epoch-100 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-20 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-40 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-60 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-80 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-100 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-120 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-140 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-160 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-180 • Reinforcement Learning • Updated 3 days ago • 1
ajagota71/pythia-410m-fb-detox-checkpoint-epoch-200 • Reinforcement Learning • Updated 3 days ago
ajagota71/pythia-410m-fb-detox • Reinforcement Learning • Updated 3 days ago