Julius-L's Collections

Unseen Papers

updated Nov 1, 2024

  • MiniPLM: Knowledge Distillation for Pre-Training Language Models

    Paper • 2410.17215 • Published Oct 22, 2024 • 16

  • LOGO -- Long cOntext aliGnment via efficient preference Optimization

    Paper • 2410.18533 • Published Oct 24, 2024 • 44

  • Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

    Paper • 2410.17243 • Published Oct 22, 2024 • 95

  • LongReward: Improving Long-context Large Language Models with AI Feedback

    Paper • 2410.21252 • Published Oct 28, 2024 • 18

  • COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training

    Paper • 2410.19313 • Published Oct 25, 2024 • 19