Update README.md
README.md CHANGED
```diff
@@ -5,6 +5,9 @@ language:
 pipeline_tag: text-generation
 ---
 
+> [!NOTE]
+> Hii!!! This is a side project, so is not the best.
+
 
 
 # AURORA-Tiny ✨
@@ -76,16 +79,6 @@ AURORA-Tiny uses a novel combination of:
 - **Multi-Head Attention**: Standard transformer attention with time modulation
 - **Output Projection**: Maps back to vocabulary space
 
-## Performance
-
-AURORA-Tiny achieves competitive results on various text generation tasks despite its compact size:
-
-| Dataset | Perplexity | BLEU Score | Training Time | Parameters |
-|---------|------------|------------|---------------|------------|
-| Movie Reviews | 28.1 | 0.38 | ~15 min | 2.4M |
-| News Articles | 35.2 | 0.34 | ~20 min | 2.4M |
-| Poetry | 23.6 | 0.31 | ~12 min | 2.4M |
-
 *Tested on RTX 3060, batch_size=16, 15 epochs. Model size: ~2.4M parameters*
 
 ## Configuration
```
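For context on the `- **Multi-Head Attention**: Standard transformer attention with time modulation` line kept above: the diff never shows that layer's code, so the following is only a minimal sketch of what "time modulation" could mean here. `TimeModulatedAttention`, the sigmoid gate, and all shapes are assumptions for illustration, not AURORA-Tiny's actual implementation.

```python
import torch
import torch.nn as nn

class TimeModulatedAttention(nn.Module):
    """Hypothetical sketch: standard multi-head self-attention whose
    output is gated by a learned function of a time embedding."""

    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Assumed form of "time modulation": a sigmoid gate computed
        # from a per-position time embedding.
        self.time_gate = nn.Sequential(nn.Linear(d_model, d_model), nn.Sigmoid())

    def forward(self, x: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        # x and t_emb: (batch, seq_len, d_model)
        out, _ = self.attn(x, x, x, need_weights=False)
        return out * self.time_gate(t_emb)  # elementwise time gating

x = torch.randn(2, 16, 128)
t_emb = torch.randn(2, 16, 128)
print(TimeModulatedAttention()(x, t_emb).shape)  # torch.Size([2, 16, 128])
```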
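The removed table reported perplexity, which is simply the exponential of the mean token-level cross-entropy; a self-contained example of that computation follows. The logits and targets below are random stand-ins, not the project's evaluation data.

```python
import torch
import torch.nn.functional as F

# Stand-in model outputs and labels: (batch, seq_len, vocab) and (batch, seq_len).
logits = torch.randn(16, 32, 1000)
targets = torch.randint(0, 1000, (16, 32))

# Perplexity = exp(mean cross-entropy over all tokens).
ce = F.cross_entropy(logits.reshape(-1, 1000), targets.reshape(-1))
print(torch.exp(ce).item())  # figures like the table's 28.1 are typically computed this way
```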