Update README.md
README.md CHANGED
@@ -489,10 +489,22 @@ response = processor.batch_decode(generated_ids, skip_special_tokens=True)
 
 MERaLiON-AudioLLM requires vLLM version `0.6.4.post1`.
 
-```
+```bash
 pip install vllm==0.6.4.post1
 ```
 
+#### Model Registration
+
+As the [vLLM documentation](https://docs.vllm.ai/en/stable/models/adding_model.html#out-of-tree-model-integration) recommends,
+we provide a way to register our model via [vLLM plugins](https://docs.vllm.ai/en/stable/design/plugin_system.html#plugin-system).
+
+```bash
+cd vllm_plugin_meralion
+pip install .
+```
+
+#### vLLM Offline Inference
+
 Here is an example of offline inference using our custom vLLM class.
 
 ```python
@@ -500,10 +512,6 @@ import torch
 from vllm import ModelRegistry, LLM, SamplingParams
 from vllm.assets.audio import AudioAsset
 
-# register custom MERaLiON-AudioLLM class
-from .vllm_meralion import MERaLiONForConditionalGeneration
-ModelRegistry.register_model("MERaLiONForConditionalGeneration", MERaLiONForConditionalGeneration)
-
 def run_meralion(question: str):
     model_name = "MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION"
 
@@ -549,6 +557,23 @@ for o in outputs:
     print(generated_text)
 ```
 
+#### OpenAI Compatible Server
+
+
+**server**
+
+Here is an example to start the server via the `vllm serve` command.
+
+```bash
+export HF_TOKEN=your-hf-token
+
+vllm serve MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION --tokenizer MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION --tokenizer-mode slow --max-model-len 1536 --max-num-seqs 8 --trust-remote-code --dtype bfloat16
+```
+
+**client**
+
+Refer to official vLLM example [code](https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_completion_client_for_multimodal.py#L213-L236).
+
 ## Disclaimer
 
 The current MERaLiON-AudioLLM has not been specifically aligned for safety and may generate content that is inappropriate, offensive, or harmful. Developers and users are responsible for performing their own safety fine-tuning and implementing necessary security measures. The authors shall not be held liable for any claims, damages, or other liabilities arising from the use of the released models, weights, or code.
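For readers unfamiliar with the plugin approach this diff switches to: per the vLLM plugin-system docs, a plugin package exposes a registration function through a setuptools entry point that vLLM scans at startup. The sketch below shows the general shape only; the module layout and entry-point name inside `vllm_plugin_meralion` are assumptions, not the package's actual contents.

```python
# Sketch of a vLLM model-registration plugin (hypothetical layout; the
# real vllm_plugin_meralion package may be organized differently).

# vllm_plugin_meralion/__init__.py
def register():
    # Imports are done lazily so installing the plugin does not
    # require vLLM to be importable at install time.
    from vllm import ModelRegistry
    from .vllm_meralion import MERaLiONForConditionalGeneration

    ModelRegistry.register_model(
        "MERaLiONForConditionalGeneration", MERaLiONForConditionalGeneration
    )

# setup.py / pyproject.toml then exposes register() under the
# entry-point group that vLLM scans on startup:
ENTRY_POINTS = {
    "vllm.general_plugins": [
        "register_meralion = vllm_plugin_meralion:register",
    ],
}
```

Once such a package is `pip install`ed, no explicit `ModelRegistry.register_model` call is needed in user code, which is why the offline-inference example above drops those lines.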
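The linked vLLM client example sends audio as a base64 data URL inside an `audio_url` content part (a vLLM extension to the OpenAI chat format). Here is a minimal client sketch under stated assumptions: the server from the `vllm serve` command above is listening on `localhost:8000`, and `speech.wav` is a placeholder input file.

```python
# Hedged sketch of an OpenAI-compatible client for the server above.
import base64


def build_audio_message(question: str, wav_bytes: bytes) -> list:
    """Build one chat message pairing a text question with inline WAV audio."""
    audio_b64 = base64.b64encode(wav_bytes).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            # vLLM accepts audio via a data-URL "audio_url" content part.
            {"type": "audio_url",
             "audio_url": {"url": f"data:audio/wav;base64,{audio_b64}"}},
            {"type": "text", "text": question},
        ],
    }]


if __name__ == "__main__":
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    with open("speech.wav", "rb") as f:  # placeholder audio file
        messages = build_audio_message("Please transcribe this speech.", f.read())
    out = client.chat.completions.create(
        model="MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION",
        messages=messages,
        max_tokens=256,
    )
    print(out.choices[0].message.content)
```

The request-building step is separated from the network call so the payload shape can be inspected without a running server.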