kaitchup/Llama-2-7b-gptq-4bit


metadata

license: apache-2.0
language:
  - en

Model Card for Llama-2-7b-gptq-4bit

This is Meta's Llama 2 7B quantized to 4-bit precision with GPTQ, using the AutoGPTQ integration in Hugging Face Transformers.
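As a rough illustration of what 4-bit precision buys and how the checkpoint might be loaded, here is a minimal sketch. It assumes that `transformers`, `optimum`, and `auto-gptq` are installed, that a CUDA GPU is available, and that the repository ships its tokenizer files. The parameter count of roughly 6.74 billion is an approximation, and the helper names `approx_weight_gib` and `generate` are illustrative, not part of any library.

```python
MODEL_ID = "kaitchup/Llama-2-7b-gptq-4bit"


def approx_weight_gib(n_params: float, bits: int) -> float:
    """Rough size of the weights alone, in GiB.

    Ignores GPTQ scales and zero-points as well as any layers
    that are kept in higher precision, so the real footprint is larger.
    """
    return n_params * bits / 8 / 1024**3


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Load the quantized model and complete `prompt` (downloads weights on first call)."""
    # Imported lazily so the size helper above works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The GPTQ settings are read from the quantization config stored in the repository.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With about 6.74e9 parameters, `approx_weight_gib(6.74e9, 4)` gives a little over 3 GiB for the weights, a quarter of the fp16 estimate. Calling `generate("The capital of France is")` would then run a short completion on the loaded model.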

Model Details

Model Description

Developed by: The Kaitchup
Model type: Causal language model (Llama 2)
Language(s) (NLP): English
License: Apache 2.0, Llama 2 license agreement

Model Sources

The method and code used to quantize the model are explained here: Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL
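For orientation, GPTQ quantization through Transformers generally looks like the sketch below. This is not the author's exact script: the base checkpoint `meta-llama/Llama-2-7b-hf`, the `c4` calibration dataset, the output directory, and the function name `quantize_to_4bit` are all assumptions made for illustration. Only the 4-bit setting comes from this card.

```python
# Assumption: the unquantized source checkpoint (gated; requires accepting Meta's license).
BASE_MODEL_ID = "meta-llama/Llama-2-7b-hf"

# bits=4 matches this model card; the calibration dataset is an assumption.
QUANT_SETTINGS = {"bits": 4, "dataset": "c4"}


def quantize_to_4bit(base_model_id: str = BASE_MODEL_ID,
                     out_dir: str = "Llama-2-7b-gptq-4bit") -> str:
    """Quantize a causal LM with GPTQ and save it to `out_dir` (needs a GPU)."""
    # Imported lazily: requires transformers, optimum, and auto-gptq.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # The tokenizer is used to prepare the calibration samples.
    gptq_config = GPTQConfig(tokenizer=tokenizer, **QUANT_SETTINGS)

    # Passing a GPTQConfig triggers calibration and quantization while loading.
    model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=gptq_config,
        device_map="auto",
    )

    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    return out_dir
```

The linked article remains the reference for the settings that were actually used.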

Uses

This is the quantized pre-trained base model; it has not been fine-tuned. You can fine-tune it with PEFT by training adapters (for example, LoRA) on top of the frozen quantized weights.
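A minimal sketch of attaching LoRA adapters with PEFT follows. The hyperparameters in `LORA_SETTINGS`, the target modules, and the function name `build_lora_model` are illustrative assumptions, not values recommended by the author. Disabling the ExLlama kernels via `use_exllama=False` is the usual setting for training GPTQ models in recent Transformers releases; older releases used a different flag name.

```python
MODEL_ID = "kaitchup/Llama-2-7b-gptq-4bit"

# Illustrative LoRA hyperparameters (assumptions, tune for your task).
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": ["q_proj", "v_proj"],  # attention projections in Llama
}


def build_lora_model(model_id: str = MODEL_ID):
    """Load the GPTQ model and wrap it with trainable LoRA adapters."""
    # Imported lazily: requires transformers, peft, optimum, and auto-gptq.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, GPTQConfig

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        # ExLlama kernels are inference-only, so they are turned off for training.
        quantization_config=GPTQConfig(bits=4, use_exllama=False),
    )

    # Freezes the base weights and prepares the model for adapter training.
    model = prepare_model_for_kbit_training(model)
    model = get_peft_model(model, LoraConfig(**LORA_SETTINGS))
    model.print_trainable_parameters()
    return model
```

The returned model can be passed to a standard trainer; only the adapter weights receive gradients, while the 4-bit base weights stay fixed.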

Other quantized versions

Model Card Contact