OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic

Method

Quantised with vllm-project/llm-compressor using the following recipe:

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]
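A minimal sketch of how this recipe would be applied with llm-compressor's one-shot API. The source model name, calibration dataset, and sample counts below are illustrative assumptions, not taken from this card:

```python
# Hedged sketch: applying the SmoothQuant + GPTQ recipe via llm-compressor.
# The checkpoint, calibration dataset, and hyperparameters are assumptions.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="nvidia/OpenReasoning-Nemotron-32B",  # assumed source checkpoint
    recipe=recipe,
    dataset="open_platypus",        # assumed calibration dataset
    max_seq_length=2048,            # assumed calibration sequence length
    num_calibration_samples=512,    # assumed sample count
    output_dir="OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic",
)
```

SmoothQuant and GPTQ both require calibration data, so a one-shot pass over a small dataset is needed; the quantised weights are then saved in safetensors format to `output_dir`.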
Model size: 32.8B params (Safetensors)
Tensor types: BF16, I8

Base model: Qwen/Qwen2.5-32B
