DFlash
Collection
Block Diffusion for Flash Speculative Decoding
6 items
Efficient AI
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference