updated readme with steps on how to run the model

README.md

Use the code below to get started with the model.

## How to run from Python code

#### First install the package

```shell
# Install ExLlamaV2 from source
!git clone https://github.com/turboderp/exllamav2
!pip install -e exllamav2
```
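These commands are written for a Jupyter or Colab notebook: the leading `!` runs the line as a shell command, and the `{...}` placeholders in later commands are filled in from the Python variables defined below. In a plain terminal, drop the `!` and substitute the values by hand. ExLlamaV2 is also published on PyPI, but cloning the repo is what provides the `test_inference.py` script used at the end of this guide.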
#### Import

```python
from huggingface_hub import login, HfApi, create_repo
from torch import bfloat16
import locale
import torch
import os
```
#### Set up variables

```python
# Define the model ID for the desired quantized model
model_id = "alokabhishek/Llama-2-7b-chat-hf-5.0-bpw-exl2"
BPW = 5.0

# Derive the local directory name used by the commands below
model_name = model_id.split("/")[-1]
# model_id already ends in the f"-{BPW:.1f}-bpw-exl2" suffix, so reuse it as the folder name
quant_name = model_name
```
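The download command below also references a Hugging Face username and access token that are not defined anywhere in this snippet. The values below are placeholders only, so substitute your own; `login` from the imports above can optionally authenticate `huggingface_hub` with the same token:

```python
# Placeholder credentials -- replace with your own Hugging Face username and access token
username = "your-hf-username"
HF_TOKEN = "hf_xxxxxxxxxxxxxxxx"

# Optional: authenticate the huggingface_hub client with the same token
login(token=HF_TOKEN)
```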
#### Download the quantized model

```shell
!git lfs install
# Download the model to a local directory
!git clone https://{username}:{HF_TOKEN}@huggingface.co/{model_id} {quant_name}
```
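As an alternative to embedding the token in a git URL, the same files can be fetched with `huggingface_hub`, which is already imported above. This is a sketch rather than part of the original instructions; it assumes the `model_id`, `quant_name`, and `HF_TOKEN` variables defined earlier:

```python
from huggingface_hub import snapshot_download

# Download the repository contents into the same local folder the commands below expect
snapshot_download(repo_id=model_id, local_dir=quant_name, token=HF_TOKEN)
```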
#### Run inference on the quantized model using ExLlamaV2

```shell
# Run the bundled test script against the downloaded model
!python exllamav2/test_inference.py -m {quant_name}/ -p "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."
```
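`test_inference.py` is only a quick smoke test. For generation from your own Python code, something along these lines should work; this is a minimal sketch based on ExLlamaV2's example scripts, so class and method names may differ slightly between versions:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Point the config at the local directory cloned above
config = ExLlamaV2Config()
config.model_dir = quant_name
config.prepare()

# Load the model and allocate the KV cache across available GPUs
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

# Simple (non-streaming) generator with basic sampling settings
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

prompt = "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."
output = generator.generate_simple(prompt, settings, 200)
print(output)
```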
## Uses