jzwong's Collections

LLM

updated 3 days ago

  • REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

    Paper • 2501.03262 • Published Jan 4 • 99

  • MiniMax-01: Scaling Foundation Models with Lightning Attention

    Paper • 2501.08313 • Published Jan 14 • 291

  • Towards Best Practices for Open Datasets for LLM Training

    Paper • 2501.08365 • Published Jan 14 • 63

  • Qwen2.5-1M Technical Report

    Paper • 2501.15383 • Published Jan 26 • 71

  • SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

    Paper • 2502.02737 • Published Feb 4 • 230

  • Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM

    Paper • 2502.06635 • Published Feb 10 • 4

  • Predictive Data Selection: The Data That Predicts Is the Data That Teaches

    Paper • 2503.00808 • Published Mar 2 • 57

  • Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

    Paper • 2503.22230 • Published Mar 28 • 44

  • WorldPM: Scaling Human Preference Modeling

    Paper • 2505.10527 • Published 12 days ago • 33

  • Qwen3 Technical Report

    Paper • 2505.09388 • Published 13 days ago • 158

  • Model Merging in Pre-training of Large Language Models

    Paper • 2505.12082 • Published 10 days ago • 35