nics-efc's Collections
Papers from the NICS-EFFALG Team
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
Paper • 2505.21600 • Published • 70
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
Paper • 2412.17153 • Published • 40
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
Paper • 2307.15337 • Published • 38
DiTFastAttn: Attention Compression for Diffusion Transformer Models
Paper • 2406.08552 • Published • 26
Can LLMs Learn by Teaching? A Preliminary Study
Paper • 2406.14629 • Published • 20
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Paper • 2406.14909 • Published • 16
A Survey on Efficient Inference for Large Language Models
Paper • 2404.14294 • Published • 3
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Paper • 2406.02540 • Published • 3
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
Paper • 2405.17873 • Published • 3
FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models
Paper • 2501.01986 • Published • 1
Evaluating Quantized Large Language Models
Paper • 2402.18158 • Published