Update README.md
README.md
CHANGED
@@ -4,12 +4,12 @@ language:
 - he
 inference: false
 ---
-# DictaBERT-large-char-menaked: An open-source BERT-based model for adding diacritiziation to Hebrew texts
+# DictaBERT-large-char-menaked: An open-source BERT-based model for adding diacritiziation marks ("nikud") to Hebrew texts
 
 This model is a fine-tuned version of [DictaBERT-large-char](https://huggingface.co/dicta-il/dictabert-large-char), dedicated to the task of adding nikud (diacritics) to Hebrew text.
 
 The model was trained on a corpus of Hebrew texts manually diacritized by linguistic experts.
-
+As of 2025-03, this model provides SOTA performance on all Hebrew vocalization benchmarks as compared to all other open-source alternatives, as well as when compared with commercial LLM alternatives.
 
 Sample usage:
 
@@ -45,10 +45,6 @@ Output:
 ```
 
 
-## Citation
-
-TBD
-
 ## License
 
 Shield: [![CC BY 4.0][cc-by-shield]][cc-by]