
Jiarui Yao

FlippyDora

AI & ML interests

None yet

Recent Activity

updated a model about 22 hours ago
FlippyDora/slimpajama-train-1280k
published a model about 22 hours ago
FlippyDora/slimpajama-train-1280k
updated a model 2 days ago
chain-of-experts/64ept-4tpk-2itr-metamathqa-10k

Organizations

  • RandomSampling
  • Embodied Reasoning Agent
  • EM-RAFT
  • Micro-RM
  • era-temporary
  • FANS - Formal Answer Selection Using Lean4
  • DPO-RM
  • CoE - Chain of Experts

FlippyDora's activity

upvoted 2 papers 5 days ago

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published 26 days ago • 16

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published 6 days ago • 21
upvoted a collection 11 days ago

Qwen3

Collection
37 items • Updated 2 days ago • 557
upvoted 2 papers 19 days ago

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published 20 days ago • 33

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published 24 days ago • 43
upvoted a paper 2 months ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 84