Papers from the NICS-EFFALG Team - a nics-efc Collection

Updated Jun 11

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
Paper • 2505.21600 • Published May 27 • 71

Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
Paper • 2412.17153 • Published Dec 22, 2024 • 40

Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
Paper • 2307.15337 • Published Jul 28, 2023 • 38

DiTFastAttn: Attention Compression for Diffusion Transformer Models
Paper • 2406.08552 • Published Jun 12, 2024 • 26

Can LLMs Learn by Teaching? A Preliminary Study
Paper • 2406.14629 • Published Jun 20, 2024 • 20

MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Paper • 2406.14909 • Published Jun 21, 2024 • 16

A Survey on Efficient Inference for Large Language Models
Paper • 2404.14294 • Published Apr 22, 2024 • 3

ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Paper • 2406.02540 • Published Jun 4, 2024 • 3

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
Paper • 2405.17873 • Published May 28, 2024 • 3

FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models
Paper • 2501.01986 • Published Dec 30, 2024 • 1

Evaluating Quantized Large Language Models
Paper • 2402.18158 • Published Feb 28, 2024