它是否正常工作,或是我的情况是否正常

by qianchenccc - opened Apr 19

Apr 19

当我部署QwQ-32B时,占用了显存19G,但是当我切换到这个模型,占用的显存仍然不变.

Open Platform for Enterprise AI org Apr 21

这个看你用的部署框架和策略。如果是transformers,所有layer都放到cuda,原始模型32B，显存应该是64B左右。

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment