Awan LLM committed
Commit 88003dd
1 Parent(s): 78847a3

Update README.md

Files changed (1):
1. README.md (+5 -0)
README.md CHANGED
@@ -15,6 +15,11 @@ In terms of reasoning and intelligence, this model is probably worse than the OG
 Will soon have quants uploaded here on HF and have it up on our site https://awanllm.com for anyone to try.
 
 
+OpenLLM Benchmark:
+
+![OpenLLM Leaderboard](https://huggingface.co/AwanLLM/Awanllm-Llama-3-8B-Cumulus-v0.2/blob/main/Screenshot%202024-05-02%20201231.png "OpenLLM Leaderboard")
+
+
 Training:
 - 4096 sequence length, while the base model is 8192 sequence length. From testing, it still performs fine at the full 8192 context.
 - Training duration is around 3 days on an RTX 4090, using 4-bit loading and QLoRA (rank 64, alpha 128), resulting in ~2% trainable weights.
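
For readers who want to see what a setup like the one in the training notes looks like in code, here is a minimal sketch using Hugging Face transformers and peft. The 4-bit loading and the QLoRA rank/alpha values come from the README above; the base model id, target modules, and dropout are assumptions, not taken from this repo.

```python
# Minimal sketch of the described training setup: 4-bit base model loading
# plus a QLoRA adapter with rank 64 / alpha 128 (~2% trainable weights).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Meta-Llama-3-8B"  # assumed base model id

# 4-bit NF4 quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
# Per the README, training used 4096-token sequences even though the
# base model supports an 8192 context.
tokenizer.model_max_length = 4096

# QLoRA adapter: rank 64, alpha 128, as stated in the README
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumption, not stated in the README
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report roughly ~2% trainable
```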