Update README.md
If you would like to deploy your LoRA adapter, please refer to the [vLLM documentation](https://docs.vllm.ai/en/latest/usage/lora.html#serving-lora-adapters) for a detailed guide.<br>
It provides step-by-step instructions on how to serve LoRA adapters effectively in a vLLM environment.<br>
**We have also shared our trained LoRA adapter** [here](https://huggingface.co/shuyuej/Public-Shared-LoRA-for-Llama-3.3-70B-Instruct-GPTQ). Please download it manually if needed:

```shell
git clone https://huggingface.co/shuyuej/Public-Shared-LoRA-for-Llama-3.3-70B-Instruct-GPTQ
```

Then, use vLLM to serve the base model with the LoRA adapter by including the `--enable-lora` flag and specifying `--lora-modules`:

```shell
vllm serve shuyuej/Llama-3.3-70B-Instruct-GPTQ \
    --quantization gptq \
    ... \
    --pipeline-parallel-size 4 \
    --api-key token-abc123 \
    --enable-lora \
    --lora-modules adapter=Public-Shared-LoRA-for-Llama-3.3-70B-Instruct-GPTQ/checkpoint-18640
```
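To confirm the adapter was registered, you can list the served models. This is a sketch that assumes the server is running on vLLM's default port 8000 and uses the API key from the command above; the response should include both the base model and the `adapter` entry.

```shell
# List all served models; the LoRA adapter appears alongside the base model.
curl http://localhost:8000/v1/models \
    -H "Authorization: Bearer token-abc123"
```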

Since this server is compatible with the OpenAI API, you can use it as a drop-in replacement for any application that uses the OpenAI API.
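As a sketch of that drop-in usage, the request below queries the LoRA adapter through the OpenAI-compatible chat completions endpoint. The model name `adapter` and the API key `token-abc123` come from the serve command above; the port assumes vLLM's default of 8000.

```shell
# Send a chat completion request to the OpenAI-compatible endpoint.
# "adapter" is the LoRA module name registered via --lora-modules;
# use the base model name instead to query the un-adapted model.
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer token-abc123" \
    -d '{
        "model": "adapter",
        "messages": [{"role": "user", "content": "Hello!"}]
    }'
```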