XiaoEnn committed (verified)
Commit 88da8bb · 1 Parent(s): f3a7d00

Update README.md

Files changed (1)
1. README.md +17 -42
README.md CHANGED
@@ -1,22 +1,24 @@
  ---
+ tags:
+ - PretrainModel
+ - TCM
+ - transformer
+ - herberta
+ - text-embedding
  license: apache-2.0
+ language:
+ - zh
+ - en
+ metrics:
+ - accuracy
+ base_model:
+ - hfl/chinese-roberta-wwm-ext-large
+ new_version: XiaoEnn/herberta_seq_512_V2
+ inference: true
+ library_name: transformers
  ---
- # Herberta: A Pretrained Model for TCM Herbal Medicine and Downstream Tasks
-
- **Tags**:
- - Pretrain_Model
- - transformers
- - TCM
- - herberta
- - text embedding
-
- **License**: Apache-2.0
- **Inference**: true
- **Language**: zh, en
- **Base Model**: hfl/chinese-roberta-wwm-ext
- **Library Name**: transformers

- ---

+ # Herberta: A Pretrained Model for TCM Herbal Medicine and Downstream Tasks

  ## Introduction

@@ -55,18 +57,6 @@ We named the model "Herberta" by combining "Herb" and "Roberta" to signify its p
  ![Loss](https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/BJ7enbRg13IYAZuxwraPP.png)
  ![Perplexity](https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/lOohRMIctPJZKM5yEEcQ2.png)

- <!-- <table>
- <tr>
- <td align="center"><strong>Accuracy</strong></td>
- <td align="center"><strong>Loss</strong></td>
- <td align="center"><strong>Perplexity</strong></td>
- </tr>
- <tr>
- <td><img src="https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/RDgI-0Ro2kMiwV853Wkgx.png" alt="Accuracy" width="800"></td>
- <td><img src="https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/BJ7enbRg13IYAZuxwraPP.png" alt="Loss" width="800"></td>
- <td><img src="https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/lOohRMIctPJZKM5yEEcQ2.png" alt="Perplexity" width="800"></td>
- </tr>
- </table> -->

  ### Pretraining Configuration

@@ -77,21 +67,6 @@ We named the model "Herberta" by combining "Herb" and "Roberta" to signify its p
  - Learning Rate: `1e-5` with an epoch-based decay (`epoch * 0.1`)
  - Tokenization: Sentence-based tokenization with padding for sequences <512 tokens.

- #### Modern Textbooks
- - Pretraining Strategy: Dynamic MASK + Warmup + Linear Decay
- - Sequence Length: 512
- - Batch Size: 16
- - Learning Rate: Warmup (10% steps) + Linear Decay (1e-5 initial rate)
- - Tokenization: Continuous tokenization (512 tokens) without sentence segmentation.
-
- #### V4 Mixed Dataset (Ancient + Modern)
- - Dataset: Combined 48 modern textbooks + 700 ancient books
- - Pretraining Strategy: Dynamic MASK, warmup, and linear decay (1e-5 learning rate).
- - Epochs: 20
- - Sequence Length: 512
- - Batch Size: 16
- - Tokenization: Continuous tokenization.
-
  ---

  ## Downstream Task: TCM Pattern Classification
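For reference, the pretraining setups listed in the hunk above (dynamic MLM masking, 512-token inputs, batch size 16, a 1e-5 learning rate with 10% warmup and linear decay) correspond closely to the stock Hugging Face masked-language-modeling recipe. The sketch below illustrates that recipe under those hyperparameters; it is not the authors' actual training script, and the corpus file, output directory, and masking rate are assumptions.

```python
# Illustrative MLM pretraining setup matching the hyperparameters quoted above.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "hfl/chinese-roberta-wwm-ext"  # base checkpoint named in the old card text
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Placeholder corpus: one document per line. Each line is truncated/padded to
# 512 tokens here; the card's "continuous tokenization" would instead
# concatenate the corpus and cut it into fixed 512-token blocks.
raw = load_dataset("text", data_files={"train": "tcm_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512, padding="max_length")

train_ds = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking: masked positions are re-sampled every time a batch is built.
# 15% is the standard BERT default; the card does not state the masking rate.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="herberta-pretrain",   # placeholder
    num_train_epochs=20,
    per_device_train_batch_size=16,
    learning_rate=1e-5,
    warmup_ratio=0.1,                 # warmup over 10% of steps
    lr_scheduler_type="linear",       # linear decay after warmup
    save_strategy="epoch",
    logging_steps=100,
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=collator,
).train()
```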
 
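Since the updated metadata declares `library_name: transformers` and tags the model for text embedding, loading it should follow the standard `transformers` pattern. A minimal usage sketch, assuming the newer checkpoint referenced by `new_version`; the mean-pooling step is a common convention for sentence embeddings rather than something the card specifies.

```python
# Minimal sketch: encode sentences with the checkpoint referenced by `new_version`
# and mean-pool token states into fixed-size embeddings (pooling choice assumed).
import torch
from transformers import AutoTokenizer, AutoModel

repo = "XiaoEnn/herberta_seq_512_V2"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)
model.eval()

sentences = ["当归补血，活血调经。", "Ephedra induces sweating and releases the exterior."]
inputs = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state      # (batch, seq_len, hidden)

# Mask-aware mean pooling over non-padding tokens.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embeddings.shape)  # (2, hidden_size); 1024 if the base is chinese-roberta-wwm-ext-large
```

For the pattern-classification downstream task mentioned below, `AutoModelForSequenceClassification` would be the analogous entry point, provided the repository ships a fine-tuned classification head.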