Kimi K2: 1T-Param MoE Model for Agentic AI

Discussion #26, opened by reach-vb

Kimi K2: Open Agentic Intelligence

Overview

Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. It excels at frontier knowledge, reasoning, and coding tasks, and is specifically optimized for agentic capabilities.

Key Features

  • Large-Scale Training: Pre-trained on 15.5T tokens with zero training instability.
  • MuonClip Optimizer: A novel optimizer (Muon combined with QK-clip) for stable scaling to trillion-parameter models.
  • Agentic Intelligence: Designed for tool use, reasoning, and autonomous problem-solving.
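Moonshot's technical report describes MuonClip as Muon combined with a QK-clip step that rescales the query/key projections whenever pre-softmax attention logits grow too large. The snippet below is a toy sketch of the clipping idea only, not the actual optimizer; the matrix sizes, threshold value, and function names are all illustrative assumptions.

```python
import math
import random

def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def qk_clip(w_q, w_k, x, tau):
    """Toy QK-clip sketch: if the largest pre-softmax attention logit on
    batch x exceeds tau, rescale both projections by sqrt(tau / max_logit)
    so the maximum logit is pulled back down to tau."""
    d = len(w_q[0])
    q, k = matmul(x, w_q), matmul(x, w_k)
    logits = [[sum(qi[t] * kj[t] for t in range(d)) / math.sqrt(d)
               for kj in k] for qi in q]
    max_logit = max(max(row) for row in logits)
    if max_logit > tau:
        gamma = math.sqrt(tau / max_logit)  # split the rescale across Q and K
        w_q = [[v * gamma for v in row] for row in w_q]
        w_k = [[v * gamma for v in row] for row in w_k]
    return w_q, w_k

random.seed(0)
x = [[random.gauss(0, 4) for _ in range(8)] for _ in range(6)]   # activations
w_q = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
w_k = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
w_q2, w_k2 = qk_clip(w_q, w_k, x, tau=5.0)
```

Because both projections share the rescale, attention logits shrink by exactly tau / max_logit after one clip, which is what keeps the softmax inputs bounded during training.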

Model Variants

  • Kimi-K2-Base: Foundation model for fine-tuning.
  • Kimi-K2-Instruct: Post-trained for general-purpose chat and agentic tasks.

Technical Specifications

  • Architecture: Mixture-of-Experts (MoE)
  • Total Parameters: 1T
  • Activated Parameters: 32B
  • Layers: 61 (including 1 dense layer)
  • Attention Hidden Dimension: 7168
  • MoE Hidden Dimension: 2048 (per expert)
  • Attention Heads: 64
  • Experts: 384
  • Selected Experts per Token: 8
  • Vocabulary Size: 160K
  • Context Length: 128K
  • Attention Mechanism: MLA
  • Activation Function: SwiGLU
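The sparsity implied by these numbers can be checked with simple arithmetic: each token routes to 8 of 384 experts, so only about 2% of expert weights are active per token, which is how 1T total parameters yields roughly 32B activated. A rough sanity check (the per-expert SwiGLU sizing below is an assumption derived from the listed hidden dimensions, not Moonshot's exact parameter breakdown):

```python
# Rough sanity check of the MoE sparsity implied by the spec table.
total_experts = 384
active_experts = 8
expert_fraction = active_experts / total_experts  # ~2.08% of experts per token

# SwiGLU FFN per expert: three weight matrices between the attention hidden
# dim (7168) and the per-expert MoE hidden dim (2048). Illustrative only.
d_model, d_expert = 7168, 2048
params_per_expert = 3 * d_model * d_expert

active_ffn_params_per_layer = active_experts * params_per_expert
total_ffn_params_per_layer = total_experts * params_per_expert

print(f"expert fraction active per token: {expert_fraction:.2%}")
```

The activated/total FLOPs ratio on the expert FFNs equals the routing fraction (8/384), independent of the per-expert size assumed above.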

Evaluation Highlights

Kimi K2 reports strong results across coding, tool-use, math, and general-knowledge benchmarks:

  • Coding: LiveCodeBench (53.7% Pass@1), SWE-bench Verified (71.6% Multiple Attempts Acc)
  • Tool Use: Tau2 retail (70.6% Avg@4), AceBench (76.5% Acc)
  • Math & STEM: AIME 2024 (69.6% Avg@64), MATH-500 (97.4% Acc)
  • General Tasks: MMLU (89.5% EM), Livebench (76.4% Pass@1)

Deployment

  • API: Available on Moonshot AI platform (OpenAI/Anthropic-compatible).
  • Inference Engines: vLLM, SGLang, KTransformers, TensorRT-LLM.
  • Model Format: Weights released in block-FP8 format on Hugging Face.
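Because the API is OpenAI-compatible, calling it amounts to pointing a standard client at Moonshot's endpoint. Below is a minimal sketch of the request that would be POSTed to the chat completions route; the base URL, model id, and environment-variable name are assumptions to verify against the platform docs.

```python
import json
import os

# Assumed endpoint and model id -- verify against Moonshot's platform docs.
BASE_URL = "https://api.moonshot.ai/v1"
API_KEY = os.environ.get("MOONSHOT_API_KEY", "")

def chat_request(model, user_text, temperature=0.6, max_tokens=256):
    """Build the JSON body and headers for POST {BASE_URL}/chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    return json.dumps(body), headers

body, headers = chat_request("kimi-k2-instruct", "Hello, Kimi")
```

The same request shape works against a local vLLM or SGLang server, which expose the same OpenAI-compatible route with a different base URL.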

Usage Examples

Chat Completion

# `client` is an OpenAI-compatible client pointed at a Kimi K2 endpoint
# (e.g. the Moonshot API or a local vLLM/SGLang server).
def simple_chat(client, model_name):
    messages = [
        {"role": "system", "content": "You are Kimi..."},
        {"role": "user", "content": [{"type": "text", "text": "Self-intro"}]}
    ]
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.6,  # recommended sampling temperature for K2-Instruct
        max_tokens=256
    )
    print(response.choices[0].message.content)

Tool Calling

def tool_call_with_client(client, model_name):
    # Define tools (JSON schemas) and tool_map (tool name -> callable) first
    messages = [...]
    finish_reason = None  # loop until the model stops requesting tools
    while finish_reason in [None, "tool_calls"]:
        completion = client.chat.completions.create(
            model=model_name, messages=messages, tools=tools, temperature=0.6)
        choice = completion.choices[0]
        finish_reason = choice.finish_reason
        if finish_reason == "tool_calls":
            messages.append(choice.message)
            # Run each requested tool via tool_map and append its result as a
            # {"role": "tool", ...} message so the model can continue
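The `tools` schema and `tool_map` dispatch table that the loop above assumes can be sketched as follows. The weather tool and all of its names are invented for illustration; the schema format follows the OpenAI-compatible function-calling convention.

```python
import json

# Hypothetical tool for illustration only.
def get_weather(city: str) -> str:
    """Stand-in implementation; a real tool would call a weather API."""
    return json.dumps({"city": city, "forecast": "sunny"})

# JSON schema advertised to the model (OpenAI-compatible tools format).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Dispatch table consulted when finish_reason == "tool_calls".
tool_map = {"get_weather": get_weather}

def run_tool_call(name: str, arguments_json: str) -> str:
    """Look up the requested tool and run it with the model-supplied args."""
    args = json.loads(arguments_json)
    return tool_map[name](**args)
```

Each tool result is appended back to `messages` with the matching `tool_call_id`, so the model sees the output on its next turn.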

License

Modified MIT License.

Contact

[email protected]

