Parkerlambert123 committed (verified) · commit aa1c5ef · 1 parent: 5dac675

Update README.md

Files changed (1): README.md (+26 -2)
README.md CHANGED

@@ -116,10 +116,29 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
 ```
 
+### ZhiLight
+
+You can easily start a service using [ZhiLight](https://github.com/zhihu/ZhiLight)
+
+```bash
+docker run -it --net=host --gpus='"device=0"' -v /path/to/model:/mnt/models --entrypoints="" ghcr.io/zhihu/zhilight/zhilight:0.4.17-cu124 python -m zhilight.server.openai.entrypoints.api_server --model-path /mnt/models --port 8000 --enable-reasoning --reasoning-parser deepseek-r1 --served-model-name Zhi-writing-dsr1-14b
+
+curl http://localhost:8000/v1/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "Zhi-writing-dsr1-14b",
+    "prompt": "请你以鲁迅的口吻,写一篇介绍西湖醋鱼的文章",
+    "max_tokens": 4096,
+    "temperature": 0.6,
+    "top_p": 0.95
+  }'
+```
+
 ### vllm
+
 For instance, you can easily start a service using [vLLM](https://github.com/vllm-project/vllm)
 
-```python
+```bash
 # install vllm
 pip install vllm>=0.6.4.post1
 
@@ -144,7 +163,8 @@ curl http://localhost:8000/v1/completions \
 ### SGLang
 
 You can also easily start a service using [SGLang](https://github.com/sgl-project/sglang)
-```python
+
+```bash
 # install SGLang
 pip install "sglang[all]>=0.4.5" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python
 
@@ -169,11 +189,15 @@ curl http://localhost:8000/v1/completions \
 ### ollama
 
 You can download ollama using [this](https://ollama.com/download/)
+
 * quantization: Q4_K_M
+
 ```bash
 ollama run zhihu/zhi-writing-dsr1-14b
 ```
+
 * bf16
+
 ```bash
 ollama run zhihu/zhi-writing-dsr1-14b:bf16
 ```
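
The ZhiLight, vLLM, and SGLang servers added above all expose an OpenAI-compatible `/v1/completions` endpoint, so the curl call shown in the diff can equally be issued from Python. A minimal sketch, assuming the `openai` client (>= 1.0) is installed and one of the servers from the diff is listening on port 8000; the `api_key` value is a placeholder, since these locally launched servers typically don't check it:

```python
from openai import OpenAI

# Point the client at the local OpenAI-compatible server started above
# (ZhiLight, vLLM, or SGLang). "EMPTY" is a placeholder key; local
# servers launched as in the diff usually ignore it.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="Zhi-writing-dsr1-14b",  # must match --served-model-name
    # Prompt taken from the diff: "In Lu Xun's voice, write an article
    # introducing West Lake vinegar fish."
    prompt="请你以鲁迅的口吻,写一篇介绍西湖醋鱼的文章",
    max_tokens=4096,
    temperature=0.6,
    top_p=0.95,
)
print(completion.choices[0].text)
```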
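The `ollama run` commands at the end of the diff start an interactive session; for programmatic use, ollama also serves a local REST API on port 11434 by default. A minimal sketch, assuming only the `requests` package; the model tag is the one from the diff:

```python
import requests

# Ollama's local REST API listens on port 11434 by default.
# stream=False returns the whole completion as a single JSON object.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "zhihu/zhi-writing-dsr1-14b",  # Q4_K_M tag from the diff
        "prompt": "请你以鲁迅的口吻,写一篇介绍西湖醋鱼的文章",
        "stream": False,
    },
)
print(resp.json()["response"])
```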