There's an issue with the to() method in PyTorch: it is being passed a dtype string ('float32'), which it tries to parse as a device (e.g., 'cpu' or 'cuda')

by dx111ge - opened Mar 12

Mar 12

  File "C:\Users\svena.cache\huggingface\modules\transformers_modules\fixie-ai\ultravox-v0_5-llama-3_1-8b\779bcda5ad4b7ed18fd0a37f065a564ca18efa31\ultravox_model.py", line 313, in _create_multi_modal_projector
    projector.to(config.torch_dtype)
  File "C:\Users\svena\VSCodePython\Ultravox\TServer.venv\Lib\site-packages\torch\nn\modules\module.py", line 1302, in to
    device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(
                                                     ^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Invalid device string: 'float32'

farzadab

Mar 12

What's the code you're using to invoke the model?
torch_dtype shouldn't be a string. It should be torch.float32.
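
To illustrate the point above, here is a minimal sketch that reproduces the error on a plain `torch.nn.Linear` layer and then avoids it by converting the string to a real `torch.dtype` first. The helper name `resolve_dtype` is hypothetical and is not part of the Ultravox code or of PyTorch; it only shows one possible way to normalize the value, assuming Python 3.9 or newer.

```python
import torch


def resolve_dtype(value):
    """Return a torch.dtype for a torch.dtype or a name such as 'float32'.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, torch.dtype):
        return value
    if isinstance(value, str):
        # Accept both 'float32' and 'torch.float32'.
        resolved = getattr(torch, value.removeprefix("torch."), None)
        if isinstance(resolved, torch.dtype):
            return resolved
    raise TypeError(f"not a valid torch dtype: {value!r}")


layer = torch.nn.Linear(4, 4)

# Passing a plain string makes Module.to() treat it as a device name.
message = None
try:
    layer.to("float32")
except RuntimeError as err:
    message = str(err)

# Passing a torch.dtype works as intended.
layer.to(resolve_dtype("float32"))
```

After the failed call, `message` holds the same kind of "Invalid device string" text as in the traceback above, and the final call leaves the layer's weights in `torch.float32`.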

dx111ge

Mar 13

I just followed the instructions on Windows (Python 3.10 with CUDA enabled, installed via pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124):
import transformers
import numpy as np
import librosa

pipe = transformers.pipeline(model='fixie-ai/ultravox-v0_5-llama-3_1-8b', trust_remote_code=True)

path = "" # TODO: pass the audio here
audio, sr = librosa.load(path, sr=16000)

turns = [
    {
        "role": "system",
        "content": "You are a friendly and helpful character. You love to answer questions for people."
    },
]
pipe({'audio': audio, 'turns': turns, 'sampling_rate': sr}, max_new_tokens=30)

dx111ge

Mar 13

Okay, it works if I add import torch and create the pipeline like this:
pipe = transformers.pipeline(
    model='fixie-ai/ultravox-v0_5-llama-3_2-1b',
    torch_dtype=torch.float32,  # Explicit dtype specification
    device=0 if torch.cuda.is_available() else -1,  # 0 = first GPU, -1 = CPU
    trust_remote_code=True
)

farzadab

Mar 13

Hmm, the pipeline should work out of the box without specifying dtype. You're right, that's a bug. I'll take a look.

btw bfloat16 is recommended if your hardware supports it since it takes less space without loss of performance (all of our training and benchmarks are in bfloat16 already).
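
As a rough sketch of the bfloat16 suggestion above, the snippet below picks bfloat16 when a CUDA device reports support for it and falls back to float32 otherwise. The function names `pick_dtype` and `build_pipeline` are hypothetical, and the fallback policy is an assumption rather than anything the Ultravox authors prescribe. `transformers` is imported inside `build_pipeline` so the dtype check can be tried without loading the model.

```python
import torch


def pick_dtype():
    """Prefer bfloat16 on CUDA hardware that supports it, else float32.

    Hypothetical helper; the fallback policy is an assumption.
    """
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float32


def build_pipeline(model_id="fixie-ai/ultravox-v0_5-llama-3_1-8b"):
    """Create the pipeline with an explicit torch.dtype (downloads the model)."""
    import transformers  # imported lazily; only needed when building the pipeline

    return transformers.pipeline(
        model=model_id,
        torch_dtype=pick_dtype(),  # a torch.dtype, never a string
        device=0 if torch.cuda.is_available() else -1,  # 0 = first GPU, -1 = CPU
        trust_remote_code=True,
    )
```

Calling `build_pipeline()` would then replace the explicit `transformers.pipeline(...)` call shown earlier in the thread.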
