eipi1-0's Collections

LLM RL papers

updated Jan 19, 2024

  • A General Theoretical Paradigm to Understand Learning from Human Preferences

    Paper • 2310.12036 • Published Oct 18, 2023 • 15