Update README.md

This model is a fine-tuned version of Qwen/Qwen3-4B-Base.
## Model description

Hex-1 is a 4-billion-parameter language model optimized for Indian languages. It supports five major Indian languages: Hindi, Kannada, Telugu, Tamil, and Malayalam.

When benchmarked against leading models such as Gemma-2-2B, Llama-3.2-3B, and Sarvam-1, Hex-1 delivers best-in-class performance across all five supported languages on the MMLU benchmark.
### Training hyperparameters
The following hyperparameters were used during training:
### Training results
### Performance Comparison on the ARC-C Dataset

| Benchmark | Gemma-2-2B | Llama-3.2-3B | Llama-3.1-8B | Sarvam-1 | Hex-1 |
|-----------|------------|--------------|--------------|----------|-------|
| arcc_hi   | 37.57      | 49.13        | 56.17        | 60.00    | 36.68 |
| arcc_ta   | 32.78      | 34.70        | 44.78        | 57.04    | 38.65 |
| arcc_te   | 30.00      | 34.09        | 43.04        | 59.39    | 37.96 |
| arcc_kn   | 29.22      | 36.43        | 44.70        | 57.04    | 38.31 |
| arcc_ml   | 29.91      | 33.22        | 46.78        | 58.96    | 29.60 |
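As a quick reading aid for the table above, here is a short stdlib-only Python sketch that macro-averages each model's five per-language ARC-C scores. The values are transcribed from the table rows; the macro-average itself is our illustration, not a metric reported by this card:

```python
# Per-language ARC-C scores transcribed from the comparison table above
# (order: arcc_hi, arcc_ta, arcc_te, arcc_kn, arcc_ml).
scores = {
    "Gemma-2-2B":   [37.57, 32.78, 30.00, 29.22, 29.91],
    "Llama-3.2-3B": [49.13, 34.70, 34.09, 36.43, 33.22],
    "Llama-3.1-8B": [56.17, 44.78, 43.04, 44.70, 46.78],
    "Sarvam-1":     [60.00, 57.04, 59.39, 57.04, 58.96],
    "Hex-1":        [36.68, 38.65, 37.96, 38.31, 29.60],
}

# Macro-average across the five language splits, rounded to two decimals.
averages = {model: round(sum(vals) / len(vals), 2) for model, vals in scores.items()}

# Print models from highest to lowest average score.
for model, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {avg}")
```

Note that these ARC-C figures are separate from the MMLU comparison referenced in the model description; on this table, Sarvam-1 leads all five language splits.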
### Framework versions