prayanksai committed
Commit fe7ac02 · verified · 1 Parent(s): af2e854

Update README.md

Files changed (1): README.md +26 -1
README.md CHANGED
@@ -1,9 +1,34 @@
 ---
 license: apache-2.0
 pipeline_tag: text-generation
-library_name: mlx
+library_name: transformers
 tags:
 - vllm
 - mlx
+- quantized
 base_model: openai/gpt-oss-120b
 ---
+
+# gpt-oss-120B (6-bit quantized via MLX-LM)
+
+**A 6-bit quantized version of `openai/gpt-oss-120b` created using MLX-LM.**
+This version reduces inference memory requirements to roughly 90 GB of RAM while retaining most of the original model's capabilities.
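+
+As a rough sanity check on that figure (assuming ~120 billion weights and ignoring runtime overhead such as the KV cache): 120 × 10⁹ weights × 6 bits ÷ 8 bits/byte ≈ 90 GB.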
+
+## 🛠️ Quantization Process
+
+The model was created using the following steps:
+
+```bash
+# Install the latest MLX-LM from source
+pip uninstall mlx-lm
+pip install git+https://github.com/ml-explore/mlx-lm.git@main
+
+# Convert the weights to MLX format, quantizing to 6 bits
+mlx_lm.convert \
+  --hf-path openai/gpt-oss-120b \
+  --quantize \
+  --q-bits 6 \
+  --mlx-path gpt-oss-120b-MLX-6bit
+```
+
+These commands use the latest MLX-LM converter to apply uniform 6-bit quantization across the model weights.
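+
+## 🚀 Usage
+
+A minimal sketch of loading the quantized model with MLX-LM's Python API. The path below assumes the `gpt-oss-120b-MLX-6bit` directory produced by the conversion step above; this repo's Hugging Face ID should work the same way.
+
+```python
+from mlx_lm import load, generate
+
+# Load the 6-bit quantized weights (local path or Hugging Face repo ID)
+model, tokenizer = load("gpt-oss-120b-MLX-6bit")
+
+prompt = "Explain 6-bit quantization in one paragraph."
+
+# Use the model's chat template when one is available
+if tokenizer.chat_template is not None:
+    messages = [{"role": "user", "content": prompt}]
+    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
+
+# Stream up to 256 new tokens to stdout and return the full text
+text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
+```
+
+The command-line equivalent is `mlx_lm.generate --model gpt-oss-120b-MLX-6bit --prompt "..." --max-tokens 256`.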