```python
model = SALM.from_pretrained('nvidia/canary-qwen-2.5b')
```
Input to Canary-Qwen-2.5B is a batch of prompts that include audio.

Example usage in ASR mode (speech-to-text):

```python
answer_ids = model.generate(
    prompts=[
        ...,
    ],
)
print(model.tokenizer.ids_to_text(answer_ids[0].cpu()))
```
Example usage in LLM mode (text-only):

```python
prompt = "..."
transcript = "..."
with model.llm.disable_adapter():
    answer_ids = model.generate(
        prompts=[[{"role": "user", "content": f"{prompt}\n\n{transcript}"}]],
        max_new_tokens=2048,
    )
```
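Note that `prompts` is plain Python data: a batch of conversations, where each conversation is a list of chat-turn dictionaries with `role` and `content` keys (hence the doubled brackets `[[...]]` for a batch of one). A minimal sketch of assembling such a batch; the sample strings here are illustrative, not from the model card:

```python
# Assemble a batch of text-only conversations in the chat format
# passed to model.generate(prompts=...) above.
prompt = "Summarize the following transcript in one sentence."
transcript = "We reviewed the quarterly results and agreed on next steps."

batch = [
    # One conversation: a list of chat turns, each a {"role", "content"} dict.
    [{"role": "user", "content": f"{prompt}\n\n{transcript}"}],
]

# The outer list is the batch; each element is one conversation.
assert isinstance(batch[0], list)
assert batch[0][0]["role"] == "user"
```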
To transcribe a dataset of recordings, specify the input as a JSONL manifest file, where each line in the file is a dictionary containing the following fields:

```yaml