liumy2010's Collections
UFT

UFT: Unifying Supervised and Reinforcement Fine-Tuning