Commit a96cc9c · verified · 1 Parent(s): 6ff27a5 · committed by hzhwcmhf

Update README.md

Files changed (1): README.md (+9 -7)
README.md CHANGED
@@ -82,21 +82,23 @@ print("thinking content:", thinking_content)
 print("content:", content)
 ```
 
-For deployment, you can use `vllm>=0.8.5` or `sglang>=0.4.5.post2` to create an OpenAI-compatible API endpoint:
-- vLLM:
+For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.4` to create an OpenAI-compatible API endpoint:
+- SGLang:
 ```shell
-vllm serve Qwen/Qwen3-235B-A22B --enable-reasoning --reasoning-parser deepseek_r1
+python -m sglang.launch_server --model-path Qwen/Qwen3-235B-A22B --reasoning-parser qwen3
 ```
-- SGLang:
+- vLLM:
 ```shell
-python -m sglang.launch_server --model-path Qwen/Qwen3-235B-A22B --reasoning-parser deepseek-r1
+vllm serve Qwen/Qwen3-235B-A22B --enable-reasoning --reasoning-parser deepseek_r1
 ```
 
+For local use, applications such as llama.cpp, Ollama, LMStudio, and MLX-LM have also added support for Qwen3.
+
 ## Switching Between Thinking and Non-Thinking Mode
 
 > [!TIP]
-> The `enable_thinking` switch is also available in APIs created by vLLM and SGLang.
-> Please refer to our documentation for [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) and [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) users.
+> The `enable_thinking` switch is also available in APIs created by SGLang and vLLM.
+> Please refer to our documentation for [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) and [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) users.
 
 ### `enable_thinking=True`
 
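Usage note on the endpoint this commit documents: once either launch command above is running, the server can be queried with any OpenAI-compatible client. The sketch below uses the `openai` Python package and makes two assumptions not stated in the diff: that the server is reachable at `http://localhost:8000/v1` (adjust to your deployment's host and port), and that the backend forwards a `chat_template_kwargs` field to the chat template for per-request control of `enable_thinking`, as the linked Qwen deployment docs describe. It is a sketch, not the repository's reference client.

```python
# Minimal sketch: query the OpenAI-compatible endpoint started above.
# Assumes the server listens at http://localhost:8000/v1 and accepts
# `chat_template_kwargs` for toggling `enable_thinking` per request.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B",
    messages=[
        {"role": "user", "content": "Give me a short introduction to large language models."}
    ],
    # Extra fields are passed through in the request body; set True (the
    # default) to keep thinking mode on for this request.
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)

print(response.choices[0].message.content)
```

When thinking mode is left on, the `--reasoning-parser` chosen in the launch commands is what separates the reasoning trace from the final answer in the API response (typically as a separate `reasoning_content` field), so clients do not need to strip `<think>` blocks themselves.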