Run in 2bit

#23
by LLMToaster - opened

Hi there, is it possible to run the model in exactly ternary bits (1.58-bit)? If so, how? The model is really small and efficient, but on Hugging Face it is stored in 16-bit. Can this be done on CPU, or must it be done on GPU? I mean exactly and natively, not converted to any other bit width. Is that possible?

It is supported. Please refer to this GitHub repo for CPU inference of this model: https://github.com/microsoft/BitNet
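
On the "2bit" in the title versus "1.58b" in the question: a ternary weight takes one of three values (-1, 0, 1), so it carries log2(3) ≈ 1.58 bits of information, and a simple way to store it is in 2 bits, four weights per byte. The sketch below only illustrates that idea. It is not the storage layout or kernel used by the BitNet repo, and the names `pack_2bit` and `unpack_2bit` are made up for this example.

```python
import math

# Information content of one ternary weight: log2(3), roughly 1.585 bits.
BITS_PER_TRIT = math.log2(3)


def pack_2bit(weights):
    """Pack ternary weights (-1, 0, 1) four per byte, 2 bits each.

    Hypothetical helper for illustration; not BitNet's actual format.
    """
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= (w + 1) << (2 * j)  # map -1, 0, 1 to codes 0, 1, 2
        out.append(byte)
    return bytes(out)


def unpack_2bit(data, n):
    """Inverse of pack_2bit; n is the original number of weights."""
    weights = []
    for byte in data:
        for j in range(4):
            weights.append(((byte >> (2 * j)) & 0b11) - 1)
    return weights[:n]  # drop padding from the last partial byte


if __name__ == "__main__":
    w = [-1, 0, 1, 1, 0, -1]
    packed = pack_2bit(w)
    print(len(packed), unpack_2bit(packed, len(w)) == w)
```

Storing 2 bits per weight wastes one of the four codes, which is the gap between 2 bits and the 1.58-bit figure. Denser schemes exist (for example, five ternary values fit in one byte because 3^5 = 243 ≤ 256), at the cost of more decoding work.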
