
Snowflake/Llama-3.1-SwiftKV-8B-Instruct
8B • 67.4k • 7
SwiftKV reduces prefill compute by up to 50% by combining model rewiring and knowledge-preserving self-distillation.