---
library_name: transformers
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
license: llama3.1
model-index:
- name: Meta-Llama-3.1-8B-Instruct-INT4
  results: []
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
---

# Model Card for Meta-Llama-3.1-8B-Instruct-INT4

This is a version of `Llama 3.1 8B Instruct` quantized to **4-bit** using `bitsandbytes` and `accelerate`.

- **Developed by:** Farid Saud @ DSRS
- **License:** llama3.1
- **Base model:** meta-llama/Meta-Llama-3.1-8B-Instruct

## Use this model

Use a pipeline as a high-level helper:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]

# Note: this ID points to the base model; substitute this repository's ID
# to load the 4-bit weights instead.
pipe = pipeline("text-generation", model="meta-llama/Meta-Llama-3.1-8B-Instruct")
pipe(messages)
```

Load the model directly:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

# Note: this ID points to the base model; substitute this repository's ID
# to load the 4-bit weights instead.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```

Information about the base model can be found in the original [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) model card.
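
The card states that the model was quantized to 4-bit with `bitsandbytes` and `accelerate` but does not give the settings used. The following is a minimal sketch of how such a 4-bit load is commonly set up through `transformers`, not the exact procedure behind this checkpoint. The NF4 quantization type, double quantization, and bfloat16 compute dtype are assumptions, and `build_4bit_kwargs` and `load_4bit` are hypothetical helper names introduced for illustration.

```python
# Minimal sketch: 4-bit loading with bitsandbytes through transformers.
# The quantization settings are assumptions; the card does not state them.

BASE_MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"


def build_4bit_kwargs():
    """Return keyword arguments for BitsAndBytesConfig (assumed values)."""
    return {
        "load_in_4bit": True,               # store linear-layer weights in 4 bits
        "bnb_4bit_quant_type": "nf4",       # assumption: NF4 rather than FP4
        "bnb_4bit_use_double_quant": True,  # assumption: also quantize the scaling constants
    }


def load_4bit(model_id=BASE_MODEL_ID):
    """Load the tokenizer and a 4-bit quantized model (requires a CUDA GPU)."""
    # Imported lazily so the settings above can be inspected without a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 for matmul compute
        **build_4bit_kwargs(),
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # accelerate places the layers on the available devices
    )
    return tokenizer, model
```

Calling `load_4bit()` returns the tokenizer and the quantized model, which can then be passed to `pipeline("text-generation", model=model, tokenizer=tokenizer)`. Access to the gated base repository and an installed `bitsandbytes` package are required for the load to succeed.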