tyzhu/nq_wikipedia_recite2-r1-ppo-qwen2.5-3b-it-em-warmup-0.05-rouge-r

Safetensors
nq_wikipedia_recite2-r1-ppo-qwen2.5-3b-it-em-warmup-0.05-rouge-r / critic
  • 1 contributor
History: 1 commit
tyzhu
Uploading folder nq_wikipedia_recite2-r1-ppo-qwen2.5-3b-it-em-warmup-0.05-rouge-rouge5 to hf tyzhu/nq_wikipedia_recite2-r1-ppo-qwen2.5-3b-it-em-warmup-0.05-rat time 2025-05-30 14:10:45
8d88aec verified 2 months ago
  • global_step_100 (2 months ago)
  • global_step_200 (2 months ago)
  • global_step_300 (2 months ago)