https://colab.research.google.com/drive/1-xZmBRXT5Fm3Ghn4Mwa2KRypORXb855X?usp=sharing
The AQLM quantization method was recently merged into the transformers main branch.
The 2-bit model can be found here: BlackSamorez/Mixtral-8x7b-AQLM-2Bit-1x16-hf-test-dispatch
You can read more about the method here: https://huggingface.co/docs/transformers/main/en/quantization#aqlm
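A minimal loading sketch, assuming you have installed transformers from main along with the aqlm package, and have a GPU with enough memory for the 2-bit checkpoint (exact memory needs and the `device_map` choice are assumptions, not tested here):

```python
# Sketch: load the 2-bit AQLM-quantized Mixtral checkpoint with transformers.
# Assumes: pip install aqlm[gpu] and transformers installed from the main branch.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BlackSamorez/Mixtral-8x7b-AQLM-2Bit-1x16-hf-test-dispatch"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 for the non-quantized parts
    device_map="auto",          # dispatch layers across available devices
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The quantization config ships inside the checkpoint, so no extra quantization arguments should be needed at load time.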
Great work @BlackSamorez and team!