KnutJaegersberg/Deacon-34B-200k-AWQ

Text Generation

text-generation-inference

4-bit precision

KnutJaegersberg committed on Nov 13, 2023

Commit ef26f36 · 1 Parent(s): a5231a1

Update README.md

Files changed (1)

README.md +10 -0

@@ -12,6 +12,16 @@ In this case the tokenizer is the yi_tokenizer, loading it requires trust_remote
 Have some fun with this fellow.
+It can eat a lot of VRAM; whether it is usable on two 24 GB GPUs depends on the settings:
+Without fused attention, it takes 27 GB of VRAM, and it will need some more once you actually do stuff with it.
+![image.png](https://cdn-uploads.huggingface.co/production/uploads/63732ebbbd81fae2b3aaf3fb/1cbqKp55WhN4BQD337E-n.png)
+You can also let it have fused attention and just reduce max_seq_length to something much smaller that is still useful:
+![image.png](https://cdn-uploads.huggingface.co/production/uploads/63732ebbbd81fae2b3aaf3fb/JRi4sakPziGpmOFCBfcJS.png)
 License
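
To see why lowering max_seq_length helps so much, here is a rough back-of-the-envelope sketch of the key/value cache size as a function of sequence length. It assumes the fused-attention path reserves a cache for the full maximum sequence length, and it assumes Yi-34B-style dimensions (60 layers, 8 key/value heads, head size 128, fp16 values). None of these numbers come from this README, so check them against the model's config.json before relying on the result. The function name `kv_cache_bytes` is made up for illustration.

```python
def kv_cache_bytes(seq_len, n_layers=60, n_kv_heads=8, head_dim=128,
                   bytes_per_value=2, batch_size=1):
    """Estimate the key/value cache size in bytes.

    The defaults are ASSUMED Yi-34B-style dimensions with fp16 values;
    they are not taken from this README.
    """
    # Factor 2: one tensor for keys and one for values in every layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


if __name__ == "__main__":
    # Compare a few candidate maximum sequence lengths.
    for seq_len in (2048, 4096, 32768, 200_000):
        gib = kv_cache_bytes(seq_len) / 2**30
        print(f"{seq_len:>7} tokens -> {gib:6.2f} GiB of KV cache")
```

Under these assumptions the cache grows linearly with the maximum sequence length, so a cache sized for the full 200k context would not fit next to the weights on two 24 GB cards, while a few thousand tokens costs on the order of 1 GiB.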