YAQA
YAQA hessians (Sketch B) and models with the QTIP quantizer. See https://github.com/Cornell-RelaxML/yaqa/tree/main for more details.
relaxml/Llama-3.1-70B-Instruct-2Bit-YAQA-QTIP • 11B • Updated May 22 • 45
relaxml/Llama-3.1-70B-Instruct-3Bit-YAQA-QTIP • 15B • Updated May 22 • 3
relaxml/Llama-3.1-70B-Instruct-4Bit-YAQA-QTIP • 19B • Updated May 22 • 3
relaxml/Llama-3.1-70B-Instruct-Hessians-2Sided • Updated May 19

QTIP Quantized Models
See https://github.com/Cornell-RelaxML/qtip
relaxml/Llama-2-7b-QTIP-4Bit • 2B • Updated Oct 28, 2024 • 17 • 2
relaxml/Llama-2-7b-QTIP-3Bit • 1B • Updated Oct 28, 2024 • 69 • 1
relaxml/Llama-2-7b-chat-QTIP-3Bit • 1B • Updated Oct 28, 2024 • 3
relaxml/Llama-2-13b-QTIP-3Bit • 3B • Updated Oct 28, 2024 • 84

QuIP# Quantized Models
See https://github.com/Cornell-RelaxML/quip-sharp
relaxml/Llama-2-7b-chat-E8PRVQ-4Bit • Text Generation • 0.7B • Updated Feb 8, 2024 • 23
relaxml/Llama-2-70b-E8PRVQ-4Bit • Text Generation • 5B • Updated Feb 8, 2024 • 9
relaxml/Llama-2-13b-E8PRVQ-4Bit • Text Generation • 1B • Updated Feb 8, 2024 • 3
relaxml/Llama-2-7b-E8PRVQ-4Bit • Text Generation • 0.7B • Updated Feb 8, 2024 • 9