<!-- Provide a quick summary of what the model is/does. -->

This is a quantized version of `Llama 3.1 8B Instruct`: the weights were quantized to **4-bit** with `bitsandbytes` and `accelerate`.
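For reference, loading the base model in 4-bit with `bitsandbytes` looks roughly like the sketch below. The exact settings used for this checkpoint are not documented here, so the `nf4` quant type and `bfloat16` compute dtype are assumptions:

```python
# Minimal sketch of a 4-bit load with bitsandbytes (not necessarily the
# exact settings used for this checkpoint: nf4 and bfloat16 are assumptions).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",  # device placement handled by accelerate
)
```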

- **Developed by:** Farid Saud @ DSRS
- **License:** llama3.1
- **Base model:** meta-llama/Meta-Llama-3.1-8B-Instruct

## Use this model

Use a pipeline as a high-level helper:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="meta-llama/Meta-Llama-3.1-8B-Instruct")
pipe(messages)
```
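The pipeline applies the model's chat template under the hood; the call should return the generated conversation with the assistant's reply appended to `messages`.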

Or load the model and tokenizer directly:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```
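From there, a chat turn can be run through the model using the tokenizer's chat template. A minimal sketch; the generation settings (`max_new_tokens=64`) are illustrative, not tuned:

```python
# Minimal sketch: run one chat turn through the directly loaded model.
# max_new_tokens is an arbitrary illustrative value.
messages = [{"role": "user", "content": "Who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```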

The base model's documentation can be found on the original model card: [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct).