ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v1 Text Generation • 7B • Updated Dec 18, 2024 • 9 • 51
Article • Making LLMs lighter with AutoGPTQ and transformers • By marcsun13 and 5 others • Aug 23, 2023 • 58
Article • TGI Multi-LoRA: Deploy Once, Serve 30 Models • By derek-thomas and 2 others • Jul 18, 2024 • 60