NL2Lean

non-profit

AI & ML interests

None defined yet.

Recent Activity

freesunshine0316 authored a paper 30 days ago

The Trickle-down Impact of Reward (In-)consistency on RLHF

freesunshine0316 authored a paper 30 days ago

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

freesunshine0316 authored a paper 30 days ago

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

models 0

None public yet

datasets 0

None public yet