prayanksai/gpt-oss-120b-MLX-6bit

Text Generation


prayanksai committed 8 days ago

Commit fe7ac02 · verified · 1 Parent(s): af2e854

Update README.md

Files changed (1)

README.md +26 -1


@@ -1,9 +1,34 @@
 ---
 license: apache-2.0
 pipeline_tag: text-generation
-library_name: mlx
 tags:
 - vllm
 - mlx
 base_model: openai/gpt-oss-120b
 ---

+Here’s an improved, well‑structured README.md / model card optimized for Hugging Face:
 ---
 license: apache-2.0
 pipeline_tag: text-generation
+library_name: transformers
 tags:
 - vllm
 - mlx
+- quantized
 base_model: openai/gpt-oss-120b
 ---
+# gpt-oss-120B (6‑bit quantized via MLX‑LM)
+**A 6‑bit quantized version of `openai/gpt-oss-120b` created using MLX‑LM.**
+This version significantly reduces inference memory requirements (~90 GB RAM), while retaining most of the model’s original capabilities.
+⸻
+🛠️ Quantization Process
+The model was created using the following steps:
+pip uninstall mlx-lm
+pip install git+https://github.com/ml-explore/mlx-lm.git@main
+mlx_lm.convert \
+  --hf-path openai/gpt-oss-120b \
+  --quantize \
+  --q-bits 6 \
+  --output-dir gpt-oss-120b-MLX-6bit
+These commands use the latest MLX‑LM converter to apply a consistent 6‑bit quantization across model weights.
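
As a rough sanity check on the ~90 GB figure above, the weight footprint of a quantized model can be estimated from the parameter count and the bits stored per weight. The sketch below is not part of the conversion itself. The 117e9 parameter count, the 64-weight group size, and the 32 bits of per-group scale and bias overhead are assumptions for illustration, and the helper name `quantized_size_gb` is hypothetical. Real memory use will differ, since some tensors may stay unquantized and inference also needs room for the KV cache and activations.

```python
def quantized_size_gb(num_params, bits, group_size=None, overhead_bits_per_group=32):
    """Estimate the weight storage of a quantized model in decimal gigabytes.

    num_params: total number of weights (assumed, not measured).
    bits: bits stored per quantized weight.
    group_size: weights sharing one scale/bias pair; None ignores that overhead.
    overhead_bits_per_group: bits spent on the per-group scale and bias
        (assumed here to be two 16-bit values).
    """
    bits_per_weight = float(bits)
    if group_size:
        # Spread the per-group metadata cost evenly over the group's weights.
        bits_per_weight += overhead_bits_per_group / group_size
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed parameter count for illustration only.
ASSUMED_PARAMS = 117e9

weights_only = quantized_size_gb(ASSUMED_PARAMS, bits=6)
with_group_overhead = quantized_size_gb(ASSUMED_PARAMS, bits=6, group_size=64)

print(f"6-bit weights only:           {weights_only:.2f} GB")
print(f"6-bit with per-group scales:  {with_group_overhead:.2f} GB")
```

Under these assumptions the estimate comes out at roughly 88 GB without per-group overhead and roughly 95 GB with it, which is in the neighborhood of the figure quoted in the model card.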