Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.14302

about 22 hours ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published 4 days ago • 61
Reward Reasoning Model

Paper • 2505.14674 • Published 4 days ago • 30
Qwen3 Technical Report

Paper • 2505.09388 • Published 10 days ago • 146
AdaptThink: Reasoning Models Can Learn When to Think

Paper • 2505.13417 • Published 5 days ago • 70

Full Paper List

Ultra-Sparse Memory Network

Paper • 2411.12364 • Published Nov 19, 2024 • 24
Hyper-Connections

Paper • 2409.19606 • Published Sep 29, 2024 • 23
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Paper • 2411.03884 • Published Nov 6, 2024 • 29
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Paper • 2501.16975 • Published Jan 28 • 31

Research Papers/Reviews/Literature

Daily Research papers and review including older relevant content.

about 11 hours ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 61
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 148
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Paper • 2503.15265 • Published Mar 19 • 47
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published Mar 18 • 47

inference optimization

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

Paper • 2501.16372 • Published Jan 23 • 9
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

Paper • 2501.16937 • Published Jan 28 • 6
Matryoshka Quantization

Paper • 2502.06786 • Published Feb 10 • 30
Identifying Sensitive Weights via Post-quantization Integral

Paper • 2503.01901 • Published Feb 28 • 8

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 59
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 53
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 43
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 60

To read... eventually

A collection of papers that i have read or plan to read all in one place. Includes a wide range of topics.

about 22 hours ago

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 128
Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19, 2024 • 54
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6, 2024 • 15
LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25, 2024 • 69

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 22
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 13
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 70

about 22 hours ago

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 20
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 12
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 14
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 49

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs