xansar's Collections: daily_paper
The Generative AI Paradox: "What It Can Create, It May Not Understand" • arXiv:2311.00059 • 20 upvotes
Teaching Large Language Models to Reason with Reinforcement Learning • arXiv:2403.04642 • 51 upvotes
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM • arXiv:2403.07816 • 42 upvotes
PERL: Parameter Efficient Reinforcement Learning from Human Feedback • arXiv:2403.10704 • 60 upvotes
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement • arXiv:2403.15042 • 28 upvotes
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text • arXiv:2403.18421 • 24 upvotes
sDPO: Don't Use Your Data All at Once • arXiv:2403.19270 • 42 upvotes
Advancing LLM Reasoning Generalists with Preference Trees • arXiv:2404.02078 • 47 upvotes
ReFT: Representation Finetuning for Language Models • arXiv:2404.03592 • 98 upvotes
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model • arXiv:2404.04167 • 14 upvotes
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies • arXiv:2404.06395 • 23 upvotes
Rho-1: Not All Tokens Are What You Need • arXiv:2404.07965 • 94 upvotes
Pre-training Small Base LMs with Fewer Tokens • arXiv:2404.08634 • 36 upvotes
Learn Your Reference Model for Real Good Alignment • arXiv:2404.09656 • 87 upvotes
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data • arXiv:2404.12195 • 12 upvotes
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series • arXiv:2405.19327 • 50 upvotes