Awan LLM
committed on
Commit • 88003dd
1 Parent(s): 78847a3
Update README.md
README.md CHANGED
@@ -15,6 +15,11 @@ In terms of reasoning and intelligence, this model is probably worse than the OG
 Will soon have quants uploaded here on HF and have the model up on our site https://awanllm.com for anyone to try.
 
 
+OpenLLM Benchmark:
+
+![OpenLLM Leaderboard](https://huggingface.co/AwanLLM/Awanllm-Llama-3-8B-Cumulus-v0.2/blob/main/Screenshot%202024-05-02%20201231.png "OpenLLM Leaderboard")
+
+
 Training:
 - Trained at a 4096 sequence length, while the base model has an 8192 sequence length. From testing, it still handles the full 8192 context just fine.
 - Training took around 3 days on an RTX 4090, using 4-bit loading and QLoRA with rank 64 and alpha 128, resulting in ~2% trainable weights.
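
The training bullets above describe a QLoRA fine-tune: 4-bit base-model loading, LoRA rank 64, alpha 128, roughly 2% trainable weights, and a 4096-token sequence length. As a rough illustration, here is a minimal sketch of that configuration using the Hugging Face transformers and peft libraries; the base model name, target modules, dropout, and NF4 quantization details are assumptions for the example, not settings confirmed by this commit.

```python
# Minimal QLoRA setup sketch matching the README's stated hyperparameters
# (4-bit loading, rank 64, alpha 128, 4096 sequence length). All other
# specifics below are assumptions, not taken from the commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed base model
MAX_SEQ_LEN = 4096                                   # training sequence length from the README

# 4-bit quantized loading of the base model (quant type and compute dtype are assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.model_max_length = MAX_SEQ_LEN

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter with rank 64 and alpha 128 as stated; the target module list is assumed
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
model = get_peft_model(model, lora_config)

# Reports the trainable-parameter share; with these settings on an 8B model it
# lands around the "~2% trainable weights" figure mentioned in the README.
model.print_trainable_parameters()
```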