vilsonrodrigues
/

OpenHermes-2.5-Mistral-7B-Pruned50-GPTQ-Marlin

Text Generation

text-generation-inference

4-bit precision

Model card Files Files and versions Community

A Mistral-7B pruned50 with Marlin Kernel and AutoGPTQ

Please see my tutorial to execute this model:

https://vilsonrodrigues.medium.com/sparse-quantize-and-serving-llms-with-neuralmagic-autogptq-and-vllm-03961b72ec3a

Downloads last month: 7

Safetensors

Model size

1.14B params

Tensor type

I32

·

FP16

·

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support