---
library_name: transformers
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
license: llama3.1
model-index:
- name: Meta-Llama-3.1-8B-Instruct-INT4
  results: []
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
---

# Model Card for Model ID

This is a quantized version of `Llama 3.1 8B Instruct`, quantized to **4-bit** using `bitsandbytes` and `accelerate`.

- **Developed by:** [More Information Needed]
- **License:** llama3.1
- **Base Model [optional]:** meta-llama/Meta-Llama-3.1-8B-Instruct

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="meta-llama/Meta-Llama-3.1-8B-Instruct")
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```

More information about the model can be found in the original [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) model card.
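
The card says the weights were quantized to 4-bit with `bitsandbytes` and `accelerate` but does not give the settings used. The following is a minimal sketch of how a 4-bit load of the base model might look with the `transformers` `BitsAndBytesConfig` API. The NF4 quantization type, double quantization, and bfloat16 compute dtype are assumptions chosen for illustration, not the settings confirmed for this checkpoint. The helper names `bnb_4bit_settings` and `load_quantized` are hypothetical.

```python
# Sketch only: the quantization settings below are assumed, not taken from this card.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"


def bnb_4bit_settings() -> dict:
    """Return keyword arguments for BitsAndBytesConfig (assumed values)."""
    return {
        "load_in_4bit": True,               # store linear-layer weights in 4-bit
        "bnb_4bit_quant_type": "nf4",       # assumption: NF4 rather than FP4
        "bnb_4bit_use_double_quant": True,  # assumption: also quantize the quantization constants
    }


def load_quantized(model_id: str = BASE_MODEL_ID):
    """Load the tokenizer and a 4-bit quantized model (hypothetical helper)."""
    # Heavy imports are kept local so the settings helper can be used without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bfloat16 for matmuls
        **bnb_4bit_settings(),
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # placement across available devices is handled by accelerate
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_quantized()
    messages = [{"role": "user", "content": "Who are you?"}]
    # Build the prompt with the model's chat template and generate a short reply.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Running this requires `bitsandbytes` and `accelerate` to be installed alongside `transformers`, a GPU supported by `bitsandbytes`, and access to the base model repository on the Hugging Face Hub.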