5090?

#4
by PovGeek - opened

Any chance to actually get these workflows to work on modern GPUs like a 5090? I've wasted days trying to get it to work and am really frustrated.

It only took me half a day, bro. Using lmdeploy is faster than vLLM, and it worked.

SGLang works too.

Please guide me. I got it running with SGLang, but the result is not correct. I think I have to adjust the chat template, but after many attempts it still fails: the output is not the same as what vLLM produces.
