|
--- |
|
license: apache-2.0 |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
tags: |
|
- vllm |
|
- mlx |
|
- quantized |
|
base_model: openai/gpt-oss-120b |
|
--- |
|
|
|
# gpt-oss-120B (6‑bit quantized via MLX‑LM) |
|
|
|
**A 6‑bit quantized version of `openai/gpt-oss-120b` created using MLX‑LM.** |
|
This quantization significantly reduces inference memory requirements (to roughly 90 GB of RAM) while retaining most of the original model's capabilities.
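
As a rough sanity check on that figure, here is a back-of-envelope estimate (a sketch that assumes ~120 billion parameters and ignores quantization scales/biases and runtime overhead such as the KV cache):

```python
# Approximate weight memory for 6-bit quantization; the true footprint is
# somewhat higher once quantization metadata and runtime buffers are counted.
params = 120e9          # approximate parameter count of gpt-oss-120b
bits_per_weight = 6
print(f"~{params * bits_per_weight / 8 / 1e9:.0f} GB")  # prints: ~90 GB
```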
|
|
|
---
|
|
|
## 🛠️ Quantization Process
|
|
|
The model was created with the following steps:

```bash
# Replace any existing MLX-LM install with the latest development version
pip uninstall mlx-lm
pip install git+https://github.com/ml-explore/mlx-lm.git@main

# Convert the original weights and quantize them to 6-bit
mlx_lm.convert \
    --hf-path openai/gpt-oss-120b \
    --quantize \
    --q-bits 6 \
    --mlx-path gpt-oss-120b-MLX-6bit
```
|
|
|
These commands install the latest MLX‑LM converter from the main branch and use it to apply uniform 6‑bit quantization across the model weights.
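
To run the quantized model with the MLX‑LM Python API, here is a minimal sketch (assuming the converted weights live in the `gpt-oss-120b-MLX-6bit` directory produced above; a Hugging Face Hub repo id works the same way):

```python
from mlx_lm import load, generate

# Load the 6-bit quantized weights and the matching tokenizer
model, tokenizer = load("gpt-oss-120b-MLX-6bit")

prompt = "Explain the trade-offs of 6-bit weight quantization."

# Apply the model's chat template when one is available
if tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```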