-
KTO: Model Alignment as Prospect Theoretic Optimization
Paper • 2402.01306 • Published • 15 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 48 -
SimPO: Simple Preference Optimization with a Reference-Free Reward
Paper • 2405.14734 • Published • 10 -
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
Paper • 2408.06266 • Published • 9
Collections
Discover the best community collections!
Collections including paper arxiv:2404.02078
-
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 24 -
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper • 2403.10704 • Published • 57 -
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
Paper • 2403.09704 • Published • 31 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67
-
Rho-1: Not All Tokens Are What You Need
Paper • 2404.07965 • Published • 84 -
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Paper • 2404.10667 • Published • 17 -
Instruction-tuned Language Models are Better Knowledge Learners
Paper • 2402.12847 • Published • 24 -
DoRA: Weight-Decomposed Low-Rank Adaptation
Paper • 2402.09353 • Published • 26
-
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 80 -
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published • 60 -
StarCoder: may the source be with you!
Paper • 2305.06161 • Published • 29 -
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper • 2312.15166 • Published • 56
-
Communicative Agents for Software Development
Paper • 2307.07924 • Published • 3 -
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2 -
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 35 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 14
-
Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published • 44 -
Locating and Editing Factual Associations in Mamba
Paper • 2404.03646 • Published • 3 -
Locating and Editing Factual Associations in GPT
Paper • 2202.05262 • Published • 1 -
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 108
-
Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published • 44 -
PointInfinity: Resolution-Invariant Point Diffusion Models
Paper • 2404.03566 • Published • 13 -
MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance
Paper • 2404.08252 • Published • 5 -
SnapKV: LLM Knows What You are Looking for Before Generation
Paper • 2404.14469 • Published • 23