conan1024hao committed c2dbdac (1 parent: fea7bb4)

Update README.md

Files changed (1):
  1. README.md +4 -3
README.md CHANGED
@@ -14,7 +14,7 @@ widget:
 ---
 
 ### Model description
-This model was trained on ZH, JA, KO's Wikipedia (5 epochs).
+- This model was trained on **ZH, JA, KO**'s Wikipedia (5 epochs).
 
 ### How to use
 ```python
@@ -22,7 +22,8 @@ from transformers import AutoTokenizer, AutoModelForMaskedLM
 tokenizer = AutoTokenizer.from_pretrained("conan1024hao/cjkbert-small")
 model = AutoModelForMaskedLM.from_pretrained("conan1024hao/cjkbert-small")
 ```
-Before you fine-tune downstream tasks, you don't need any text segmentation. (Though you may obtain better results if you applied morphological analysis to the data before fine-tuning.)
+- Before you fine-tune downstream tasks, you don't need any text segmentation.
+- (Though you may obtain better results if you applied morphological analysis to the data before fine-tuning.)
 
 ### Morphological analysis tools
 - ZH: For Chinese, we use [LTP](https://github.com/HIT-SCIR/ltp).
@@ -30,7 +31,7 @@ Before you fine-tune downstream tasks, you don't need any text segmentation. (Th
 - KO: For Korean, we use [KoNLPy](https://github.com/konlpy/konlpy)(Kkma class).
 
 ### Tokenization
-We use character-based tokenization with **whole-word-masking** strategy.
+- We use character-based tokenization with **whole-word-masking** strategy.
 
 ### Model size
 - vocab_size: 15015
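
The updated card stresses that raw text needs no segmentation before use. A minimal sketch of what that looks like at inference time, via the standard `fill-mask` pipeline on the masked-LM head the card loads; the sample sentence is illustrative, not from the card:

```python
# Minimal sketch: masked-token prediction on raw, unsegmented CJK text.
# The sample sentence is our own, not taken from the model card.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="conan1024hao/cjkbert-small")

# No morphological analysis or word segmentation is applied first;
# the character-based tokenizer consumes the raw string directly.
# With character tokenization, [MASK] stands in for a single character.
for prediction in fill_mask("日本の首都は[MASK]京です。"):
    print(prediction["token_str"], prediction["score"])
```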
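The card also suggests that optional morphological analysis may improve fine-tuning results. A hedged sketch of Korean pre-segmentation with KoNLPy's Kkma class, the analyzer the card names for KO; the sample sentence and the space-joined output format are our assumptions, not prescribed by the card:

```python
# Hedged sketch: optional Korean pre-segmentation with KoNLPy's Kkma.
# KoNLPy wraps JVM-based analyzers, so a Java runtime is required.
# Sample text and the space-joining convention are assumptions.
from konlpy.tag import Kkma

kkma = Kkma()
morphemes = kkma.morphs("한국어 형태소 분석은 선택 사항입니다.")
print(" ".join(morphemes))  # morphemes separated by spaces
```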
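For the tokenization note, a quick way to see the character-based behavior is to tokenize a short string; the expected one-token-per-character output shown in the comment is inferred from the card's description, not verified here:

```python
# Hedged sketch: inspecting the character-based tokenization the card
# describes. The expected output below is inferred, not verified.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("conan1024hao/cjkbert-small")
print(tokenizer.tokenize("日本語を勉強する"))
# Expected per the card: ['日', '本', '語', 'を', '勉', '強', 'す', 'る']
```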