Huiqiang Jiang

iofu728


https://hqjiang.com/

AI & ML interests

None yet

Recent Activity

authored a paper 27 days ago

Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling

commented on a paper 27 days ago

Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling

upvoted a paper 27 days ago

Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling


Organizations

upvoted a paper 27 days ago

Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling

Paper • 2606.12370 • Published 29 days ago • 21

upvoted a paper 7 months ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 107

upvoted 3 papers about 1 year ago

Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published May 17, 2025 • 121

RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference

Paper • 2505.02922 • Published May 5, 2025 • 29

MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention

Paper • 2504.16083 • Published Apr 22, 2025 • 8

upvoted 7 papers over 1 year ago

Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published Jan 28, 2025 • 36

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published Jan 23, 2025 • 48

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8, 2025 • 289

upvoted an article almost 2 years ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • medmekk, marcsun13, lvwerra, pcuenq, osanseviero, thomwolf • Sep 18, 2024 • 281

upvoted 2 papers almost 2 years ago

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published Sep 16, 2024 • 43

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Paper • 2408.11049 • Published Aug 20, 2024 • 14

upvoted 3 articles almost 2 years ago

A failed experiment: Infini-Attention, and why we should keep trying?

Article • neuralink, lvwerra, thomwolf • Aug 14, 2024 • 76

RegMix: Data Mixture as Regression for Language Model Pre-training

Article • SivilTaram • Jul 11, 2024 • 16

MInference 1.0: 10x Faster Million Context Inference with a Single GPU

Article • liyucheng • Jul 11, 2024 • 14

upvoted a paper about 2 years ago

MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention

Paper • 2407.02490 • Published Jul 2, 2024 • 26

upvoted a paper over 2 years ago

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19, 2024 • 25
