- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 95
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
  Paper • 2405.21060 • Published • 68
- Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
  Paper • 2405.20541 • Published • 24
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
  Paper • 2406.01574 • Published • 49
Daeseong Kim (dkimds)

AI & ML interests
RL, LLMs, RLHF, and related topics.
Recent Activity
- liked a dataset 7 days ago: valurank/Topic_Classification
- liked a model 3 months ago: weoweke23/btc-model-predictor-trainedwith-brownian-noise
- updated a collection 6 months ago: DS' Daily paper
Organizations
None yet