Smaller dense versions?

by TheBigBlockPC - opened 23 days ago

23 days ago

Are smaller dense distills of this model planned? I ask because bitsandbytes quantisation doesn't really work on MoE models. A 40B dense version would be useful for local deployment.

kingriel

19 days ago

If you look at videos on YouTube, the model isn't even that good. For an 80B model it seems quite bad.

TheBigBlockPC

18 days ago

How does it compare to Qwen-Image?
