Commit 8fb1c05 · Parent(s): 32965b1
Update README.md
README.md CHANGED
@@ -13,4 +13,42 @@ This model is build for [bnlp](https://github.com/sagorbrur/bnlp) package.
 - Fasttext trained with total words = 20M, vocab size = 1171011, epoch=50, embedding dimension = 300
 
 ## Evaluation Details
-- training loss = 0.318668
+- training loss = 0.318668
+
+## Usage
+- `pip install -U bnlp_toolkit`
+- `pip install fasttext==0.9.2`
+- Generate Vector Using Pretrained Model
+```py
+from bnlp.embedding.fasttext import BengaliFasttext
+
+bft = BengaliFasttext()
+word = "গ্রাম"
+model_path = "bengali_fasttext_wiki.bin"
+word_vector = bft.generate_word_vector(model_path, word)
+print(word_vector.shape)
+print(word_vector)
+```
+
+- Train Bengali FastText Model
+
+```py
+from bnlp.embedding.fasttext import BengaliFasttext
+
+bft = BengaliFasttext()
+data = "raw_text.txt"
+model_name = "saved_model.bin"
+epoch = 50
+bft.train(data, model_name, epoch)
+```
+
+- Generate Vector File from Fasttext Binary Model
+```py
+from bnlp.embedding.fasttext import BengaliFasttext
+
+bft = BengaliFasttext()
+
+model_path = "mymodel.bin"
+out_vector_name = "myvector.txt"
+bft.bin2vec(model_path, out_vector_name)
+```
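
Not part of this commit: a minimal sketch of how the text file written by `bft.bin2vec(model_path, out_vector_name)` might be read back and compared, assuming it follows the common word2vec text layout (a header line `<vocab_size> <dimension>`, then one `word v1 v2 ...` line per word). That layout is an assumption and has not been checked against the file this README produces. The helper names `load_vectors` and `cosine_similarity` are hypothetical, and the sample vectors are made up so the snippet runs without the real model.

```python
import io
import math


def load_vectors(handle):
    """Parse word2vec-style text: a 'count dim' header, then 'word v1 ... vN' lines."""
    header = handle.readline().split()
    dim = int(header[1])  # header[0] is the declared vocabulary size
    vectors = {}
    for line in handle:
        parts = line.split()
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up 4-dimensional stand-in for a real vector file (assumed layout).
SAMPLE = (
    "3 4\n"
    "গ্রাম 0.1 0.2 0.3 0.4\n"
    "শহর 0.1 0.2 0.3 0.5\n"
    "নদী -0.4 0.3 -0.2 0.1\n"
)

dim, vectors = load_vectors(io.StringIO(SAMPLE))
print(dim, len(vectors))
print(cosine_similarity(vectors["গ্রাম"], vectors["শহর"]))
```

For a real file, the same function could be given `open("myvector.txt", encoding="utf-8")` in place of the `io.StringIO` stand-in. With a vocabulary of about 1.17M words at 300 dimensions, a plain dictionary of Python lists would be memory-hungry, so an array-based loader is likely a better fit in practice.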