nytopop commited on
Commit
22c7d27
·
verified ·
1 Parent(s): 3c66597

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +17 -0
README.md CHANGED
@@ -6,6 +6,23 @@ pipeline_tag: text-generation
6
  base_model: Qwen/Qwen3-30B-A3B
7
  ---
8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9
  # creation
10
 
11
  ```python
 
6
  base_model: Qwen/Qwen3-30B-A3B
7
  ---
8
 
9
+ # usage with sglang
10
+
11
+ Currently, upstream sglang doesn't load this quant correctly due to a few minor issues. Until upstream is fixed, a working fork is available at https://github.com/nytopop/sglang/tree/qwen-30b-a3b:
12
+
13
+ ```shell
14
+ uv venv --python 3.12
15
+
16
+ # vllm is needed to load w4a16 quant scheme
17
+ uv pip install vllm>=0.8.5
18
+
19
+ # use patched sglang from git
20
+ uv pip install git+https://github.com/nytopop/sglang.git@qwen-30b-a3b#subdirectory=python[all] --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python
21
+
22
+ # run
23
+ uv run python -m sglang.launch_server --model-path nytopop/Qwen3-30B-A3B.w4a16 --reasoning-parser qwen3 --dtype float16
24
+ ```
25
+
26
  # creation
27
 
28
  ```python