Robby Milletich

rmill040

AI & ML interests

None yet

Organizations

None yet

rmill040's activity

upvoted 2 papers 5 months ago

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms

Paper • 2406.02900 • Published Jun 5 • 10

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2 • 30

upvoted an article 5 months ago

Uncensor any LLM with abliteration

Article • Jun 13 • 364

upvoted 6 papers 9 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 602

LiPO: Listwise Preference Optimization through Learning-to-Rank

Paper • 2402.01878 • Published Feb 2 • 19

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29 • 48

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Paper • 2402.01391 • Published Feb 2 • 41

HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy

Paper • 2401.15207 • Published Jan 26 • 1

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Paper • 2402.00159 • Published Jan 31 • 59

upvoted a paper 11 months ago

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Paper • 2312.17120 • Published Dec 28, 2023 • 25