Update README.md
README.md CHANGED
@@ -23,12 +23,22 @@ The additional details of the Aquila model will be presented in the official technical report.
 ### 1. Inference
 
 ```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from transformers import BitsAndBytesConfig
+
 device = torch.device("cuda:0")
 model_info = "BAAI/AquilaChat2-7B-16K"
 tokenizer = AutoTokenizer.from_pretrained(model_info, trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained(model_info, trust_remote_code=True, torch_dtype=torch.float16)
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+)
+model = AutoModelForCausalLM.from_pretrained(model_info, trust_remote_code=True, torch_dtype=torch.float16,
+                                             # quantization_config=quantization_config,  # Uncomment this line for 4-bit quantization
+                                             )
 model.eval()
 model.to(device)
 text = "请给出10个要到北京旅游的理由。"
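
One caveat if you uncomment the `quantization_config` line: a bitsandbytes-quantized model is placed on the GPU at load time, and recent `transformers` versions reject `.to(device)` on 4-bit models, so the `model.to(device)` call should be skipped on that path. Below is a minimal sketch of the 4-bit variant, assuming `bitsandbytes` (and `accelerate`, for `device_map`) are installed; `device_map="auto"` is my addition, not part of the commit:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_info = "BAAI/AquilaChat2-7B-16K"
tokenizer = AutoTokenizer.from_pretrained(model_info, trust_remote_code=True)

# Same 4-bit NF4 settings as the diff above.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_info,
    trust_remote_code=True,
    quantization_config=quantization_config,
    device_map="auto",  # weights land on the GPU here; do NOT call model.to(device) afterwards
)
model.eval()
```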
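
The diffed snippet stops right after defining the prompt (`text` asks, in Chinese, for ten reasons to visit Beijing). A minimal way to continue with the stock `generate` API is sketched below; the model's `trust_remote_code` package may expose its own chat helper instead, so treat this as an assumption rather than the repo's documented interface:

```python
# Continuation sketch, reusing model, tokenizer, and text from the snippet above.
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,  # hypothetical budget; tune as needed for the 16K-context model
        do_sample=True,
        top_p=0.95,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```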