Can you provide machine specs?

#2
by kingabzpro - opened

How many H100s are required to run this model locally, and what other parameters matter for hardware optimization?

From the deployment guide:

The smallest deployment unit for Kimi-K2 FP8 weights with 128k seqlen on mainstream H200 or H20 platform is a cluster with 16 GPUs with either Tensor Parallel (TP) or "data parallel + expert parallel" (DP+EP).

https://github.com/MoonshotAI/Kimi-K2/blob/main/docs/deploy_guidance.md

Moonshot AI org

At least 16 H100s are needed, and that only works with a very short sequence length (suitable for simple testing). For a normal experience, 32 H100s are required.
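For a rough sense of where these numbers come from, here is a back-of-the-envelope sketch in Python. It is not from the deployment guide. It assumes roughly 1T total parameters for Kimi-K2, about 1 byte per parameter for FP8 weights, 80 GB per H100, and 141 GB per H200. The function name `weight_headroom_gb` is made up for illustration. The "headroom" is only what is left after weights, before KV cache, activations, and framework overhead, so real usable space is smaller.

```python
def weight_headroom_gb(num_gpus, gpu_mem_gb, params_billion=1000.0, bytes_per_param=1.0):
    """Return (total_memory_gb, weights_gb, headroom_gb) for a cluster.

    Assumptions (not from the deployment guide): ~1000B total parameters
    and ~1 byte per parameter for FP8 weights.
    """
    total = num_gpus * gpu_mem_gb
    # billions of params * bytes per param = gigabytes of weights
    weights = params_billion * bytes_per_param
    return total, weights, total - weights


if __name__ == "__main__":
    # Per-GPU memory figures are assumed: 80 GB (H100), 141 GB (H200).
    clusters = [("16x H100", 16, 80), ("32x H100", 32, 80), ("16x H200", 16, 141)]
    for label, n, mem in clusters:
        total, weights, headroom = weight_headroom_gb(n, mem)
        print(f"{label}: total {total} GB, weights ~{weights:.0f} GB, "
              f"headroom ~{headroom:.0f} GB ({headroom / n:.1f} GB per GPU)")
```

Under these assumptions, 16 H100s leave well under 20 GB per GPU after weights, which is consistent with only short sequences fitting, while 32 H100s or 16 H200s leave much more room for KV cache at long context.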
