Update README.md
README.md CHANGED
@@ -13,6 +13,7 @@ Tags:
@@ -102,6 +103,22 @@ tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
|
|
 - mixtral
 - moe
 - discoresearch
+license: apache-2.0
 ---
 
 
 If you use `tokenize=True` and `return_tensors="pt"` instead, then you will get a tokenized and formatted conversation ready to pass to `model.generate()`.
 
+Basic inference code:
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model = AutoModelForCausalLM.from_pretrained("DiscoResearch/DiscoLM-mixtral-8x7b-v2", low_cpu_mem_usage=True, device_map="auto", trust_remote_code=True)
+tok = AutoTokenizer.from_pretrained("DiscoResearch/DiscoLM-mixtral-8x7b-v2")
+chat = [
+    {"role": "system", "content": "You are DiscoLM, a helpful assistant."},
+    {"role": "user", "content": "Please tell me possible reasons to call a research collective Disco Research"}
+]
+x = tok.apply_chat_template(chat, tokenize=True, return_tensors="pt", add_generation_prompt=True).cuda()
+x = model.generate(x, max_new_tokens=128).cpu()
+print(tok.batch_decode(x))
+```
+
 ## Datasets
 
 The following datasets were used for training DiscoLM Mixtral 8x7b alpha: