Adding Evaluation Results

#2
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -17,4 +17,17 @@ Javalion-R is a research artefact with dual purpose for entertainment as well as
 
   Mileage mat vary. No refunds best wishes. Mainly intended to be utilized with Open Source KoboldAI software. Optimal sampler and settings not determined. Feedback Welcome!
 
- https://github.com/KoboldAI/KoboldAI-Client
+ https://github.com/KoboldAI/KoboldAI-Client
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_digitous__Javalion-R)
+
+ | Metric              | Value |
+ |---------------------|-------|
+ | Avg.                | 35.42 |
+ | ARC (25-shot)       | 41.72 |
+ | HellaSwag (10-shot) | 68.02 |
+ | MMLU (5-shot)       | 30.81 |
+ | TruthfulQA (0-shot) | 34.44 |
+ | Winogrande (5-shot) | 65.43 |
+ | GSM8K (5-shot)      | 2.65  |
+ | DROP (3-shot)       | 4.85  |
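
As a quick sanity check on the table, the sketch below recomputes the `Avg.` row from the seven benchmark scores. It assumes the average is a plain unweighted mean rounded to two decimals; the PR does not state how it was computed, so this is inferred from the numbers only. The names `scores` and `mean_score` are hypothetical and used purely for illustration.

```python
# Benchmark scores copied from the table in this PR.
scores = {
    "ARC (25-shot)": 41.72,
    "HellaSwag (10-shot)": 68.02,
    "MMLU (5-shot)": 30.81,
    "TruthfulQA (0-shot)": 34.44,
    "Winogrande (5-shot)": 65.43,
    "GSM8K (5-shot)": 2.65,
    "DROP (3-shot)": 4.85,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard average weights every benchmark equally.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. = {average}")
```

Under that assumption the recomputed value matches the 35.42 reported in the `Avg.` row.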