fsaudm/Meta-Llama-3.1-8B-Instruct-NF4


	---
	base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
	language:
	- en
	- de
	- fr
	- it
	- pt
	- hi
	- es
	- th
	library_name: transformers
	license: llama3.1
	tags:
	- facebook
	- meta
	- pytorch
	- llama
	- llama-3
	model-index:
	- name: Meta-Llama-3.1-8B-Instruct-NF4
	  results: []
	---

	# Model Card for Meta-Llama-3.1-8B-Instruct-NF4

	This is a 4-bit quantized version of `Llama 3.1 8B Instruct`, produced with `bitsandbytes` and `accelerate`.

	- Developed by: Farid Saud @ DSRS
	- License: llama3.1
	- Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
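
To give a feel for what 4-bit quantization buys, the following back-of-the-envelope sketch compares weight storage at 16 bits and 4 bits per parameter. The parameter count is an assumed round figure of 8 billion, not a value taken from this card, and the estimate ignores layers that typically stay in higher precision as well as the quantization constants stored alongside the weights, so the real checkpoint will be somewhat larger.

```python
# Rough weight-memory estimate under stated assumptions (not a measurement).
ASSUMED_PARAMS = 8_000_000_000  # assumed round parameter count for an "8B" model


def weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


bf16_gb = weight_memory_gb(ASSUMED_PARAMS, 16)  # 16-bit baseline
nf4_gb = weight_memory_gb(ASSUMED_PARAMS, 4)    # 4-bit quantized weights

print(f"16-bit weights: ~{bf16_gb:.1f} GB")
print(f"4-bit weights:  ~{nf4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the space of the 16-bit ones.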

	## Use this model


	Use a pipeline as a high-level helper:
	```python
	from transformers import pipeline

	# Chat-style input: a list of role/content messages
	messages = [
	    {"role": "user", "content": "Who are you?"},
	]
	pipe = pipeline("text-generation", model="fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
	# Print the result so it is visible when run as a script
	print(pipe(messages))
	```



	Load the model directly:
	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# The tokenizer and the quantized weights come from the same repository
	tokenizer = AutoTokenizer.from_pretrained("fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
	model = AutoModelForCausalLM.from_pretrained("fsaudm/Meta-Llama-3.1-8B-Instruct-NF4")
	```
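
Once the tokenizer and model are loaded, a reply can be generated along the following lines. This is a minimal sketch, not the author's documented workflow: `build_messages` and `generate_reply` are hypothetical helper names, `max_new_tokens=64` is an arbitrary choice, and the code assumes `bitsandbytes` and `accelerate` are installed and a CUDA-capable GPU is available.

```python
def build_messages(prompt: str) -> list[dict[str, str]]:
    """Wrap a user prompt in the role/content chat format (hypothetical helper)."""
    return [{"role": "user", "content": prompt}]


def generate_reply(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the quantized model and generate one reply (hypothetical helper)."""
    # Imported here so build_messages works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "fsaudm/Meta-Llama-3.1-8B-Instruct-NF4"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" is an assumption; it requires accelerate.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Apply the model's chat template and tokenize in a single step.
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply("Who are you?")` downloads the checkpoint on first use, so expect the first call to take a while.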

	Information about the base model can be found in the original [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) model card.