Vaibhavi Singh

contactvaibhavi · https://www.vaibhavisingh.com/

AI & ML interests

NLP

Recent Activity

upvoted a collection 4 days ago

Reward Models

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 5 days ago • 11

upvoted 2 collections 18 days ago

OLMo-1B-as_fm3_tg_omi2

OLMo 1B model pretrained on Algebraic Stack, FineMath3, TinyGSM, and OpenMathInstruct2. Includes checkpoints from PPO training on the GSM8K train split. • 25 items • Updated 18 days ago • 1

OLMo-1B-as_fm3_tg_omi1_omi2

OLMo 1B model pretrained on Algebraic Stack, FineMath3, TinyGSM, OMI1, and OMI2. Includes checkpoints from PPO training on the GSM8K train split. • 25 items • Updated 18 days ago • 1