RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference • arXiv:2505.02922 • Published May 5, 2025
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking • arXiv:2501.04519 • Published Jan 8, 2025
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval • arXiv:2409.10516 • Published Sep 16, 2024