Update README.md
README.md CHANGED
@@ -489,10 +489,22 @@ response = processor.batch_decode(generated_ids, skip_special_tokens=True)
 
 MERaLiON-AudioLLM requires vLLM version `0.6.4.post1`.
 
-```
+```bash
 pip install vllm==0.6.4.post1
 ```
 
+#### Model Registration
+
+As the [vLLM documentation](https://docs.vllm.ai/en/stable/models/adding_model.html#out-of-tree-model-integration) recommends,
+we provide a way to register our model via [vLLM plugins](https://docs.vllm.ai/en/stable/design/plugin_system.html#plugin-system).
+
+```bash
+cd vllm_plugin_meralion
+pip install .
+```
+
+#### vLLM Offline Inference
+
 Here is an example of offline inference using our custom vLLM class.
 
 ```python
@@ -500,10 +512,6 @@ import torch
 from vllm import ModelRegistry, LLM, SamplingParams
 from vllm.assets.audio import AudioAsset
 
-# register custom MERaLiON-AudioLLM class
-from .vllm_meralion import MERaLiONForConditionalGeneration
-ModelRegistry.register_model("MERaLiONForConditionalGeneration", MERaLiONForConditionalGeneration)
-
 def run_meralion(question: str):
     model_name = "MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION"
 
@@ -549,6 +557,23 @@ for o in outputs:
     print(generated_text)
 ```
 
+#### OpenAI Compatible Server
+
+
+**server**
+
+Here is an example to start the server via the `vllm serve` command.
+
+```bash
+export HF_TOKEN=your-hf-token
+
+vllm serve MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION --tokenizer MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION --tokenizer-mode slow --max-model-len 1536 --max-num-seqs 8 --trust-remote-code --dtype bfloat16
+```
+
+**client**
+
+Refer to official vLLM example [code](https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_completion_client_for_multimodal.py#L213-L236).
+
 ## Disclaimer
 
 The current MERaLiON-AudioLLM has not been specifically aligned for safety and may generate content that is inappropriate, offensive, or harmful. Developers and users are responsible for performing their own safety fine-tuning and implementing necessary security measures. The authors shall not be held liable for any claims, damages, or other liabilities arising from the use of the released models, weights, or code.
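For readers unfamiliar with the plugin approach this diff switches to: per the vLLM plugin-system docs, a plugin package exposes a registration function through a setuptools entry point that vLLM scans at startup. The sketch below shows the general shape only; the module layout and entry-point name inside `vllm_plugin_meralion` are assumptions, not the package's actual contents.

```python
# Sketch of a vLLM model-registration plugin (hypothetical layout; the
# real vllm_plugin_meralion package may be organized differently).

# vllm_plugin_meralion/__init__.py
def register():
    # Imports are done lazily so installing the plugin does not
    # require vLLM to be importable at install time.
    from vllm import ModelRegistry
    from .vllm_meralion import MERaLiONForConditionalGeneration

    ModelRegistry.register_model(
        "MERaLiONForConditionalGeneration", MERaLiONForConditionalGeneration
    )

# setup.py / pyproject.toml then exposes register() under the
# entry-point group that vLLM scans on startup:
ENTRY_POINTS = {
    "vllm.general_plugins": [
        "register_meralion = vllm_plugin_meralion:register",
    ],
}
```

Once such a package is `pip install`ed, no explicit `ModelRegistry.register_model` call is needed in user code, which is why the offline-inference example above drops those lines.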
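The linked vLLM client example sends audio as a base64 data URL inside an `audio_url` content part (a vLLM extension to the OpenAI chat format). Here is a minimal client sketch under stated assumptions: the server from the `vllm serve` command above is listening on `localhost:8000`, and `speech.wav` is a placeholder input file.

```python
# Hedged sketch of an OpenAI-compatible client for the server above.
import base64


def build_audio_message(question: str, wav_bytes: bytes) -> list:
    """Build one chat message pairing a text question with inline WAV audio."""
    audio_b64 = base64.b64encode(wav_bytes).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            # vLLM accepts audio via a data-URL "audio_url" content part.
            {"type": "audio_url",
             "audio_url": {"url": f"data:audio/wav;base64,{audio_b64}"}},
            {"type": "text", "text": question},
        ],
    }]


if __name__ == "__main__":
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    with open("speech.wav", "rb") as f:  # placeholder audio file
        messages = build_audio_message("Please transcribe this speech.", f.read())
    out = client.chat.completions.create(
        model="MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION",
        messages=messages,
        max_tokens=256,
    )
    print(out.choices[0].message.content)
```

The request-building step is separated from the network call so the payload shape can be inspected without a running server.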