Hugging Face
poonyZ's Collections
llm

updated Jan 9

  • Training Large Language Models to Reason in a Continuous Latent Space

    Paper • 2412.06769 • Published Dec 9, 2024 • 87

  • Byte Latent Transformer: Patches Scale Better Than Tokens

    Paper • 2412.09871 • Published Dec 13, 2024 • 106

  • Qwen2.5 Technical Report

    Paper • 2412.15115 • Published Dec 19, 2024 • 368

  • YuLan-Mini: An Open Data-efficient Language Model

    Paper • 2412.17743 • Published Dec 23, 2024 • 67

  • ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

    Paper • 2412.14711 • Published Dec 19, 2024 • 16

  • Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

    Paper • 2412.17739 • Published Dec 23, 2024 • 42

  • REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

    Paper • 2501.03262 • Published Jan 4, 2025 • 100