Is it worth hosting a quantized DeepSeek V3? Cost & performance insights?

#15
by Techw - opened

Hi everyone,

I’m fairly new to working with quantized models, and I’m exploring whether it’s worth self-hosting a quantized version of DeepSeek V3 on SageMaker while still getting good-quality results.

I’d appreciate insights from those with experience:

  1. Is it worth exploring this approach in terms of both quality and cost?
  2. What kind of computing resources (e.g., instance types, memory, GPU/CPU) would I need?
  3. Any rough cost estimates for running it effectively?

Any recommendations or shared experiences would be really helpful. Thanks in advance!
