KoichiYasuoka committed
Commit 1dcc374 · 1 Parent(s): 79ddf2b

initial release

Files changed (9)
  1. README.md +33 -0
  2. config.json +1629 -0
  3. maker.py +118 -0
  4. pytorch_model.bin +3 -0
  5. special_tokens_map.json +7 -0
  6. tokenizer.json +0 -0
  7. tokenizer_config.json +62 -0
  8. ud.py +150 -0
  9. vocab.txt +0 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ ---
+ language:
+ - "lzh"
+ tags:
+ - "classical chinese"
+ - "literary chinese"
+ - "ancient chinese"
+ - "token-classification"
+ - "pos"
+ - "dependency-parsing"
+ base_model: Jihuai/bert-ancient-chinese
+ datasets:
+ - "universal_dependencies"
+ license: "apache-2.0"
+ pipeline_tag: "token-classification"
+ widget:
+ - text: "孟子見梁惠王"
+ ---
+
+ # bert-ancient-chinese-base-ud-embeds
+
+ ## Model Description
+
+ This is a BERT model pre-trained on Classical Chinese texts for POS-tagging and dependency-parsing, derived from [bert-ancient-chinese](https://huggingface.co/Jihuai/bert-ancient-chinese) and [UD_Classical_Chinese-Kyoto](https://github.com/UniversalDependencies/UD_Classical_Chinese-Kyoto).
+
+ ## How to Use
+
+ ```py
+ from transformers import pipeline
+ nlp=pipeline("universal-dependencies","KoichiYasuoka/bert-ancient-chinese-base-ud-embeds",trust_remote_code=True)
+ print(nlp("孟子見梁惠王"))
+ ```
+
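The `id2label` values in `config.json` in this commit pack several annotations into one string, for example `ADP|Degree=Equ|l-cc` or `NOUN|root`. The sketch below splits such a string into a UPOS tag, a feature map, and a relation tag. It is a minimal illustration and not the decoding logic of `ud.py`, which is not shown here: `ParsedLabel` and `parse_label` are hypothetical names, and reading the `l-`/`r-` prefix as a direction marker is an assumption based only on the label strings.

```python
from typing import Dict, NamedTuple, Optional


class ParsedLabel(NamedTuple):
    """Hypothetical container for the parts of one id2label string."""
    upos: str                 # first field, e.g. "ADP", "B-ADP", "ADP."
    feats: Dict[str, str]     # middle "Key=Value" fields, possibly empty
    direction: Optional[str]  # "l" or "r" if the last field has that prefix (assumed meaning)
    deprel: Optional[str]     # last field without the prefix, e.g. "cc", "root", "_"


def parse_label(label: str) -> ParsedLabel:
    """Split a label such as 'ADP|Degree=Equ|l-cc' on '|' into its parts."""
    fields = label.split("|")
    upos = fields[0]
    if len(fields) == 1:
        # Bare tags like "ADP", "ADP." or "B-ADP" carry no relation field.
        return ParsedLabel(upos, {}, None, None)
    *feat_fields, rel = fields[1:]
    feats = dict(f.split("=", 1) for f in feat_fields)
    if rel[:2] in ("l-", "r-"):
        # Assumption: the prefix marks a direction; the rest is the relation name.
        return ParsedLabel(upos, feats, rel[0], rel[2:])
    return ParsedLabel(upos, feats, None, rel)


if __name__ == "__main__":
    for example in ("ADP|Degree=Equ|l-cc", "ADV|AdvType=Deg|Degree=Pos|r-flat:vv", "NOUN|root", "B-ADP."):
        print(example, "->", parse_label(example))
```

Relation names that contain a colon, such as `flat:vv` or `obl:lmod`, pass through unchanged because the split is only on `|` and on the first `=` of each feature field.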
config.json ADDED
@@ -0,0 +1,1629 @@
+ {
+   "architectures": [
+     "BertForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "custom_pipelines": {
+     "upos": {
+       "impl": "ud.BellmanFordTokenClassificationPipeline",
+       "pt": "AutoModelForTokenClassification"
+     },
+     "universal-dependencies": {
+       "impl": "ud.UniversalDependenciesPipeline",
+       "pt": "AutoModelForTokenClassification"
+     }
+   },
+   "directionality": "bidi",
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "ADP",
+     "1": "ADP.",
+     "2": "ADP|Degree=Equ|_",
+     "3": "ADP|Degree=Equ|l-cc",
+     "4": "ADP|_",
+     "5": "ADP|l-acl",
+     "6": "ADP|l-advcl",
+     "7": "ADP|l-amod",
+     "8": "ADP|l-case",
+     "9": "ADP|l-cc",
+     "10": "ADP|l-mark",
+     "11": "ADP|l-nsubj",
+     "12": "ADP|l-obl",
+     "13": "ADP|r-case",
+     "14": "ADP|r-conj",
+     "15": "ADP|r-fixed",
+     "16": "ADP|r-mark",
+     "17": "ADP|r-obj",
+     "18": "ADP|root",
+     "19": "ADV",
+     "20": "ADV.",
+     "21": "ADV|AdvType=Cau|_",
+     "22": "ADV|AdvType=Cau|l-advmod",
+     "23": "ADV|AdvType=Cau|l-amod",
+     "24": "ADV|AdvType=Cau|l-nsubj",
+     "25": "ADV|AdvType=Cau|l-obj",
+     "26": "ADV|AdvType=Deg|Degree=Cmp|_",
+     "27": "ADV|AdvType=Deg|Degree=Cmp|l-advmod",
+     "28": "ADV|AdvType=Deg|Degree=Cmp|l-amod",
+     "29": "ADV|AdvType=Deg|Degree=Cmp|r-conj",
+     "30": "ADV|AdvType=Deg|Degree=Cmp|r-obj",
+     "31": "ADV|AdvType=Deg|Degree=Pos|_",
+     "32": "ADV|AdvType=Deg|Degree=Pos|l-advmod",
+     "33": "ADV|AdvType=Deg|Degree=Pos|l-amod",
+     "34": "ADV|AdvType=Deg|Degree=Pos|r-ccomp",
+     "35": "ADV|AdvType=Deg|Degree=Pos|r-conj",
+     "36": "ADV|AdvType=Deg|Degree=Pos|r-flat:vv",
+     "37": "ADV|AdvType=Deg|Degree=Pos|r-parataxis",
+     "38": "ADV|AdvType=Deg|Degree=Pos|root",
+     "39": "ADV|AdvType=Deg|Degree=Sup|_",
+     "40": "ADV|AdvType=Deg|Degree=Sup|l-advmod",
+     "41": "ADV|AdvType=Deg|Degree=Sup|l-amod",
+     "42": "ADV|AdvType=Deg|Degree=Sup|l-nsubj",
+     "43": "ADV|AdvType=Deg|Degree=Sup|r-conj",
+     "44": "ADV|AdvType=Deg|Degree=Sup|r-parataxis",
+     "45": "ADV|AdvType=Deg|Degree=Sup|root",
+     "46": "ADV|AdvType=Tim|Aspect=Perf|_",
+     "47": "ADV|AdvType=Tim|Aspect=Perf|l-advmod",
+     "48": "ADV|AdvType=Tim|Aspect=Perf|l-amod",
+     "49": "ADV|AdvType=Tim|Aspect=Perf|l-obl:lmod",
+     "50": "ADV|AdvType=Tim|Aspect=Perf|r-parataxis",
+     "51": "ADV|AdvType=Tim|Aspect=Perf|root",
+     "52": "ADV|AdvType=Tim|Tense=Fut|_",
+     "53": "ADV|AdvType=Tim|Tense=Fut|l-advmod",
+     "54": "ADV|AdvType=Tim|Tense=Fut|l-amod",
+     "55": "ADV|AdvType=Tim|Tense=Fut|l-nsubj",
+     "56": "ADV|AdvType=Tim|Tense=Fut|l-nsubj:outer",
+     "57": "ADV|AdvType=Tim|Tense=Fut|root",
+     "58": "ADV|AdvType=Tim|Tense=Past|_",
+     "59": "ADV|AdvType=Tim|Tense=Past|l-advmod",
+     "60": "ADV|AdvType=Tim|Tense=Past|l-amod",
+     "61": "ADV|AdvType=Tim|Tense=Pres|_",
+     "62": "ADV|AdvType=Tim|Tense=Pres|l-advmod",
+     "63": "ADV|AdvType=Tim|Tense=Pres|l-amod",
+     "64": "ADV|AdvType=Tim|Tense=Pres|root",
+     "65": "ADV|AdvType=Tim|_",
+     "66": "ADV|AdvType=Tim|l-advcl",
+     "67": "ADV|AdvType=Tim|l-advmod",
+     "68": "ADV|AdvType=Tim|l-amod",
+     "69": "ADV|AdvType=Tim|l-nsubj",
+     "70": "ADV|AdvType=Tim|r-advmod",
+     "71": "ADV|AdvType=Tim|r-ccomp",
+     "72": "ADV|AdvType=Tim|r-compound:redup",
+     "73": "ADV|AdvType=Tim|r-conj",
+     "74": "ADV|AdvType=Tim|r-flat:vv",
+     "75": "ADV|AdvType=Tim|r-parataxis",
+     "76": "ADV|AdvType=Tim|root",
+     "77": "ADV|Degree=Equ|VerbForm=Conv|_",
+     "78": "ADV|Degree=Equ|VerbForm=Conv|l-advmod",
+     "79": "ADV|Degree=Pos|VerbForm=Conv|_",
+     "80": "ADV|Degree=Pos|VerbForm=Conv|l-advmod",
+     "81": "ADV|Degree=Pos|VerbForm=Conv|r-advmod",
+     "82": "ADV|Polarity=Neg|VerbForm=Conv|_",
+     "83": "ADV|Polarity=Neg|VerbForm=Conv|l-advmod",
+     "84": "ADV|Polarity=Neg|_",
+     "85": "ADV|Polarity=Neg|l-advmod",
+     "86": "ADV|Polarity=Neg|l-amod",
+     "87": "ADV|Polarity=Neg|l-nsubj",
+     "88": "ADV|Polarity=Neg|l-parataxis",
+     "89": "ADV|Polarity=Neg|r-advmod",
+     "90": "ADV|Polarity=Neg|r-conj",
+     "91": "ADV|Polarity=Neg|r-obj",
+     "92": "ADV|Polarity=Neg|r-parataxis",
+     "93": "ADV|Polarity=Neg|root",
+     "94": "ADV|VerbForm=Conv|_",
+     "95": "ADV|VerbForm=Conv|l-advmod",
+     "96": "ADV|VerbForm=Conv|r-advmod",
+     "97": "ADV|_",
+     "98": "ADV|l-acl",
+     "99": "ADV|l-advcl",
+     "100": "ADV|l-advmod",
+     "101": "ADV|l-amod",
+     "102": "ADV|l-cc",
+     "103": "ADV|l-nsubj",
+     "104": "ADV|r-advmod",
+     "105": "ADV|r-ccomp",
+     "106": "ADV|r-conj",
+     "107": "ADV|r-flat:vv",
+     "108": "ADV|r-obj",
+     "109": "ADV|root",
+     "110": "AUX",
+     "111": "AUX.",
+     "112": "AUX|Mood=Des|_",
+     "113": "AUX|Mood=Des|l-aux",
+     "114": "AUX|Mood=Des|l-csubj",
+     "115": "AUX|Mood=Des|l-parataxis",
+     "116": "AUX|Mood=Des|r-ccomp",
+     "117": "AUX|Mood=Des|r-conj",
+     "118": "AUX|Mood=Des|r-flat:vv",
+     "119": "AUX|Mood=Des|root",
+     "120": "AUX|Mood=Nec|_",
+     "121": "AUX|Mood=Nec|l-acl",
+     "122": "AUX|Mood=Nec|l-amod",
+     "123": "AUX|Mood=Nec|l-aux",
+     "124": "AUX|Mood=Nec|r-aux",
+     "125": "AUX|Mood=Nec|root",
+     "126": "AUX|Mood=Pot|_",
+     "127": "AUX|Mood=Pot|l-acl",
+     "128": "AUX|Mood=Pot|l-advcl",
+     "129": "AUX|Mood=Pot|l-amod",
+     "130": "AUX|Mood=Pot|l-aux",
+     "131": "AUX|Mood=Pot|l-csubj",
+     "132": "AUX|Mood=Pot|l-nsubj",
+     "133": "AUX|Mood=Pot|r-ccomp",
+     "134": "AUX|Mood=Pot|r-conj",
+     "135": "AUX|Mood=Pot|r-obj",
+     "136": "AUX|Mood=Pot|r-parataxis",
+     "137": "AUX|Mood=Pot|r-xcomp",
+     "138": "AUX|Mood=Pot|root",
+     "139": "AUX|VerbType=Cop|_",
+     "140": "AUX|VerbType=Cop|l-cop",
+     "141": "AUX|Voice=Pass|_",
+     "142": "AUX|Voice=Pass|l-aux",
+     "143": "AUX|Voice=Pass|r-conj",
+     "144": "AUX|Voice=Pass|root",
+     "145": "B-ADP",
+     "146": "B-ADP.",
+     "147": "B-ADV",
+     "148": "B-ADV.",
+     "149": "B-AUX",
+     "150": "B-AUX.",
+     "151": "B-CCONJ",
+     "152": "B-CCONJ.",
+     "153": "B-INTJ",
+     "154": "B-INTJ.",
+     "155": "B-NOUN",
+     "156": "B-NOUN.",
+     "157": "B-NUM",
+     "158": "B-NUM.",
+     "159": "B-PART",
+     "160": "B-PART.",
+     "161": "B-PRON",
+     "162": "B-PRON.",
+     "163": "B-PROPN",
+     "164": "B-PROPN.",
+     "165": "B-PUNCT",
+     "166": "B-PUNCT.",
+     "167": "B-SCONJ",
+     "168": "B-SCONJ.",
+     "169": "B-SYM",
+     "170": "B-SYM.",
+     "171": "B-VERB",
+     "172": "B-VERB.",
+     "173": "CCONJ",
+     "174": "CCONJ.",
+     "175": "CCONJ|_",
+     "176": "CCONJ|l-advmod",
+     "177": "CCONJ|l-amod",
+     "178": "CCONJ|l-cc",
+     "179": "CCONJ|l-obj",
+     "180": "CCONJ|r-fixed",
+     "181": "CCONJ|r-orphan",
+     "182": "I-ADP",
+     "183": "I-ADP.",
+     "184": "I-ADV",
+     "185": "I-ADV.",
+     "186": "I-AUX",
+     "187": "I-AUX.",
+     "188": "I-CCONJ",
+     "189": "I-CCONJ.",
+     "190": "I-INTJ",
+     "191": "I-INTJ.",
+     "192": "I-NOUN",
+     "193": "I-NOUN.",
+     "194": "I-NUM",
+     "195": "I-NUM.",
+     "196": "I-PART",
+     "197": "I-PART.",
+     "198": "I-PRON",
+     "199": "I-PRON.",
+     "200": "I-PROPN",
+     "201": "I-PROPN.",
+     "202": "I-PUNCT",
+     "203": "I-PUNCT.",
+     "204": "I-SCONJ",
+     "205": "I-SCONJ.",
+     "206": "I-SYM",
+     "207": "I-SYM.",
+     "208": "I-VERB",
+     "209": "I-VERB.",
+     "210": "INTJ",
+     "211": "INTJ.",
+     "212": "INTJ|_",
+     "213": "INTJ|l-advcl",
+     "214": "INTJ|l-csubj",
+     "215": "INTJ|l-discourse",
+     "216": "INTJ|l-discourse:sp",
+     "217": "INTJ|l-dislocated",
+     "218": "INTJ|l-nsubj",
+     "219": "INTJ|l-vocative",
+     "220": "INTJ|r-compound:redup",
+     "221": "INTJ|r-conj",
+     "222": "INTJ|r-discourse:sp",
+     "223": "INTJ|r-dislocated",
+     "224": "INTJ|r-fixed",
+     "225": "INTJ|r-obj",
+     "226": "INTJ|r-parataxis",
+     "227": "INTJ|root",
+     "228": "NOUN",
+     "229": "NOUN.",
+     "230": "NOUN|Case=Loc|_",
+     "231": "NOUN|Case=Loc|l-acl",
+     "232": "NOUN|Case=Loc|l-advcl",
+     "233": "NOUN|Case=Loc|l-amod",
+     "234": "NOUN|Case=Loc|l-clf",
+     "235": "NOUN|Case=Loc|l-compound",
+     "236": "NOUN|Case=Loc|l-csubj",
+     "237": "NOUN|Case=Loc|l-dislocated",
+     "238": "NOUN|Case=Loc|l-nmod",
+     "239": "NOUN|Case=Loc|l-nsubj",
+     "240": "NOUN|Case=Loc|l-nsubj:outer",
+     "241": "NOUN|Case=Loc|l-obj",
+     "242": "NOUN|Case=Loc|l-obl",
+     "243": "NOUN|Case=Loc|l-obl:lmod",
+     "244": "NOUN|Case=Loc|l-obl:tmod",
+     "245": "NOUN|Case=Loc|l-parataxis",
+     "246": "NOUN|Case=Loc|r-ccomp",
+     "247": "NOUN|Case=Loc|r-clf",
+     "248": "NOUN|Case=Loc|r-compound:redup",
+     "249": "NOUN|Case=Loc|r-conj",
+     "250": "NOUN|Case=Loc|r-dislocated",
+     "251": "NOUN|Case=Loc|r-flat",
+     "252": "NOUN|Case=Loc|r-iobj",
+     "253": "NOUN|Case=Loc|r-list",
+     "254": "NOUN|Case=Loc|r-nmod",
+     "255": "NOUN|Case=Loc|r-nsubj",
+     "256": "NOUN|Case=Loc|r-obj",
+     "257": "NOUN|Case=Loc|r-obl",
+     "258": "NOUN|Case=Loc|r-obl:lmod",
+     "259": "NOUN|Case=Loc|r-parataxis",
+     "260": "NOUN|Case=Loc|r-xcomp",
+     "261": "NOUN|Case=Loc|root",
+     "262": "NOUN|Case=Tem|_",
+     "263": "NOUN|Case=Tem|l-acl",
+     "264": "NOUN|Case=Tem|l-advcl",
+     "265": "NOUN|Case=Tem|l-amod",
+     "266": "NOUN|Case=Tem|l-compound",
+     "267": "NOUN|Case=Tem|l-csubj",
+     "268": "NOUN|Case=Tem|l-nmod",
+     "269": "NOUN|Case=Tem|l-nsubj",
+     "270": "NOUN|Case=Tem|l-nsubj:outer",
+     "271": "NOUN|Case=Tem|l-obj",
+     "272": "NOUN|Case=Tem|l-obl:tmod",
+     "273": "NOUN|Case=Tem|r-amod",
+     "274": "NOUN|Case=Tem|r-ccomp",
+     "275": "NOUN|Case=Tem|r-clf",
+     "276": "NOUN|Case=Tem|r-compound:redup",
+     "277": "NOUN|Case=Tem|r-conj",
+     "278": "NOUN|Case=Tem|r-flat",
+     "279": "NOUN|Case=Tem|r-iobj",
+     "280": "NOUN|Case=Tem|r-list",
+     "281": "NOUN|Case=Tem|r-nsubj",
+     "282": "NOUN|Case=Tem|r-obj",
+     "283": "NOUN|Case=Tem|r-obl:tmod",
+     "284": "NOUN|Case=Tem|r-parataxis",
+     "285": "NOUN|Case=Tem|r-xcomp",
+     "286": "NOUN|Case=Tem|root",
+     "287": "NOUN|Degree=Pos|_",
+     "288": "NOUN|Degree=Pos|root",
+     "289": "NOUN|NounType=Clf|_",
+     "290": "NOUN|NounType=Clf|l-clf",
+     "291": "NOUN|NounType=Clf|l-nmod",
+     "292": "NOUN|NounType=Clf|l-nsubj",
+     "293": "NOUN|NounType=Clf|l-obl",
+     "294": "NOUN|NounType=Clf|r-ccomp",
+     "295": "NOUN|NounType=Clf|r-clf",
+     "296": "NOUN|NounType=Clf|r-compound:redup",
+     "297": "NOUN|NounType=Clf|r-conj",
+     "298": "NOUN|NounType=Clf|r-flat",
+     "299": "NOUN|NounType=Clf|r-obj",
+     "300": "NOUN|NounType=Clf|r-parataxis",
+     "301": "NOUN|NounType=Clf|root",
+     "302": "NOUN|_",
+     "303": "NOUN|l-acl",
+     "304": "NOUN|l-advcl",
+     "305": "NOUN|l-amod",
+     "306": "NOUN|l-ccomp",
+     "307": "NOUN|l-clf",
+     "308": "NOUN|l-compound",
+     "309": "NOUN|l-csubj",
+     "310": "NOUN|l-csubj:outer",
+     "311": "NOUN|l-dislocated",
+     "312": "NOUN|l-iobj",
+     "313": "NOUN|l-list",
+     "314": "NOUN|l-nmod",
+     "315": "NOUN|l-nsubj",
+     "316": "NOUN|l-nsubj:outer",
+     "317": "NOUN|l-nsubj:pass",
+     "318": "NOUN|l-obj",
+     "319": "NOUN|l-obl",
+     "320": "NOUN|l-obl:lmod",
+     "321": "NOUN|l-obl:tmod",
+     "322": "NOUN|l-vocative",
+     "323": "NOUN|r-acl",
+     "324": "NOUN|r-advcl",
+     "325": "NOUN|r-amod",
+     "326": "NOUN|r-ccomp",
+     "327": "NOUN|r-clf",
+     "328": "NOUN|r-compound:redup",
+     "329": "NOUN|r-conj",
+     "330": "NOUN|r-csubj",
+     "331": "NOUN|r-dislocated",
+     "332": "NOUN|r-flat",
+     "333": "NOUN|r-flat:foreign",
+     "334": "NOUN|r-iobj",
+     "335": "NOUN|r-list",
+     "336": "NOUN|r-nmod",
+     "337": "NOUN|r-nsubj",
+     "338": "NOUN|r-obj",
+     "339": "NOUN|r-obl",
+     "340": "NOUN|r-obl:lmod",
+     "341": "NOUN|r-parataxis",
+     "342": "NOUN|r-vocative",
+     "343": "NOUN|r-xcomp",
+     "344": "NOUN|root",
+     "345": "NUM",
+     "346": "NUM.",
+     "347": "NUM|NumType=Ord|_",
+     "348": "NUM|NumType=Ord|l-nsubj",
+     "349": "NUM|NumType=Ord|l-nummod",
+     "350": "NUM|NumType=Ord|l-obl",
+     "351": "NUM|NumType=Ord|l-obl:lmod",
+     "352": "NUM|NumType=Ord|l-obl:tmod",
+     "353": "NUM|NumType=Ord|r-conj",
+     "354": "NUM|NumType=Ord|r-flat",
+     "355": "NUM|NumType=Ord|r-obj",
+     "356": "NUM|NumType=Ord|root",
+     "357": "NUM|_",
+     "358": "NUM|l-acl",
+     "359": "NUM|l-advcl",
+     "360": "NUM|l-compound",
+     "361": "NUM|l-csubj",
+     "362": "NUM|l-dislocated",
+     "363": "NUM|l-nsubj",
+     "364": "NUM|l-nsubj:outer",
+     "365": "NUM|l-nummod",
+     "366": "NUM|l-obj",
+     "367": "NUM|l-obl",
+     "368": "NUM|l-obl:lmod",
+     "369": "NUM|l-obl:tmod",
+     "370": "NUM|r-ccomp",
+     "371": "NUM|r-clf",
+     "372": "NUM|r-compound",
+     "373": "NUM|r-compound:redup",
+     "374": "NUM|r-conj",
+     "375": "NUM|r-flat",
+     "376": "NUM|r-iobj",
+     "377": "NUM|r-list",
+     "378": "NUM|r-nummod",
+     "379": "NUM|r-obj",
+     "380": "NUM|r-obl",
+     "381": "NUM|r-obl:tmod",
+     "382": "NUM|r-parataxis",
+     "383": "NUM|r-xcomp",
+     "384": "NUM|root",
+     "385": "PART",
+     "386": "PART.",
+     "387": "PART|_",
+     "388": "PART|l-acl",
+     "389": "PART|l-advcl",
+     "390": "PART|l-advmod",
+     "391": "PART|l-amod",
+     "392": "PART|l-case",
+     "393": "PART|l-cc",
+     "394": "PART|l-csubj",
+     "395": "PART|l-csubj:outer",
+     "396": "PART|l-discourse",
+     "397": "PART|l-discourse:sp",
+     "398": "PART|l-dislocated",
+     "399": "PART|l-mark",
+     "400": "PART|l-nmod",
+     "401": "PART|l-nsubj",
+     "402": "PART|l-nsubj:outer",
+     "403": "PART|l-nsubj:pass",
+     "404": "PART|l-obj",
+     "405": "PART|l-obl",
+     "406": "PART|l-obl:lmod",
+     "407": "PART|r-advmod",
+     "408": "PART|r-case",
+     "409": "PART|r-ccomp",
+     "410": "PART|r-clf",
+     "411": "PART|r-conj",
+     "412": "PART|r-discourse",
+     "413": "PART|r-discourse:sp",
+     "414": "PART|r-dislocated",
+     "415": "PART|r-fixed",
+     "416": "PART|r-flat",
+     "417": "PART|r-iobj",
+     "418": "PART|r-list",
+     "419": "PART|r-mark",
+     "420": "PART|r-nsubj",
+     "421": "PART|r-obj",
+     "422": "PART|r-obl",
+     "423": "PART|r-parataxis",
+     "424": "PART|r-xcomp",
+     "425": "PART|root",
+     "426": "PRON",
+     "427": "PRON.",
+     "428": "PRON|Person=1|PronType=Prs|_",
+     "429": "PRON|Person=1|PronType=Prs|l-acl",
+     "430": "PRON|Person=1|PronType=Prs|l-advcl",
+     "431": "PRON|Person=1|PronType=Prs|l-det",
+     "432": "PRON|Person=1|PronType=Prs|l-iobj",
+     "433": "PRON|Person=1|PronType=Prs|l-nsubj",
+     "434": "PRON|Person=1|PronType=Prs|l-nsubj:outer",
+     "435": "PRON|Person=1|PronType=Prs|l-obj",
+     "436": "PRON|Person=1|PronType=Prs|l-obl",
+     "437": "PRON|Person=1|PronType=Prs|l-vocative",
+     "438": "PRON|Person=1|PronType=Prs|r-ccomp",
+     "439": "PRON|Person=1|PronType=Prs|r-conj",
+     "440": "PRON|Person=1|PronType=Prs|r-iobj",
+     "441": "PRON|Person=1|PronType=Prs|r-nsubj",
+     "442": "PRON|Person=1|PronType=Prs|r-obj",
+     "443": "PRON|Person=1|PronType=Prs|r-obl",
+     "444": "PRON|Person=1|PronType=Prs|r-obl:lmod",
+     "445": "PRON|Person=1|PronType=Prs|root",
+     "446": "PRON|Person=2|PronType=Prs|_",
+     "447": "PRON|Person=2|PronType=Prs|l-advcl",
+     "448": "PRON|Person=2|PronType=Prs|l-amod",
+     "449": "PRON|Person=2|PronType=Prs|l-det",
+     "450": "PRON|Person=2|PronType=Prs|l-nmod",
+     "451": "PRON|Person=2|PronType=Prs|l-nsubj",
+     "452": "PRON|Person=2|PronType=Prs|l-nsubj:outer",
+     "453": "PRON|Person=2|PronType=Prs|l-obj",
+     "454": "PRON|Person=2|PronType=Prs|l-obl",
+     "455": "PRON|Person=2|PronType=Prs|l-vocative",
+     "456": "PRON|Person=2|PronType=Prs|r-conj",
+     "457": "PRON|Person=2|PronType=Prs|r-flat",
+     "458": "PRON|Person=2|PronType=Prs|r-iobj",
+     "459": "PRON|Person=2|PronType=Prs|r-obj",
+     "460": "PRON|Person=2|PronType=Prs|r-obl",
+     "461": "PRON|Person=2|PronType=Prs|root",
+     "462": "PRON|Person=3|PronType=Prs|_",
+     "463": "PRON|Person=3|PronType=Prs|l-advcl",
+     "464": "PRON|Person=3|PronType=Prs|l-amod",
+     "465": "PRON|Person=3|PronType=Prs|l-det",
+     "466": "PRON|Person=3|PronType=Prs|l-dislocated",
+     "467": "PRON|Person=3|PronType=Prs|l-expl",
+     "468": "PRON|Person=3|PronType=Prs|l-iobj",
+     "469": "PRON|Person=3|PronType=Prs|l-nsubj",
+     "470": "PRON|Person=3|PronType=Prs|l-nsubj:outer",
+     "471": "PRON|Person=3|PronType=Prs|l-nsubj:pass",
+     "472": "PRON|Person=3|PronType=Prs|l-obj",
+     "473": "PRON|Person=3|PronType=Prs|l-obl",
+     "474": "PRON|Person=3|PronType=Prs|r-ccomp",
+     "475": "PRON|Person=3|PronType=Prs|r-conj",
+     "476": "PRON|Person=3|PronType=Prs|r-expl",
+     "477": "PRON|Person=3|PronType=Prs|r-iobj",
+     "478": "PRON|Person=3|PronType=Prs|r-nsubj",
+     "479": "PRON|Person=3|PronType=Prs|r-obj",
+     "480": "PRON|Person=3|PronType=Prs|r-obl",
+     "481": "PRON|Person=3|PronType=Prs|root",
+     "482": "PRON|PronType=Dem|_",
+     "483": "PRON|PronType=Dem|l-acl",
+     "484": "PRON|PronType=Dem|l-advcl",
+     "485": "PRON|PronType=Dem|l-amod",
+     "486": "PRON|PronType=Dem|l-compound",
+     "487": "PRON|PronType=Dem|l-det",
+     "488": "PRON|PronType=Dem|l-dislocated",
+     "489": "PRON|PronType=Dem|l-expl",
+     "490": "PRON|PronType=Dem|l-nsubj",
+     "491": "PRON|PronType=Dem|l-nsubj:outer",
+     "492": "PRON|PronType=Dem|l-obj",
+     "493": "PRON|PronType=Dem|l-obl",
+     "494": "PRON|PronType=Dem|l-obl:lmod",
+     "495": "PRON|PronType=Dem|r-conj",
+     "496": "PRON|PronType=Dem|r-det",
+     "497": "PRON|PronType=Dem|r-expl",
+     "498": "PRON|PronType=Dem|r-flat",
+     "499": "PRON|PronType=Dem|r-iobj",
+     "500": "PRON|PronType=Dem|r-obj",
+     "501": "PRON|PronType=Dem|r-obl",
+     "502": "PRON|PronType=Dem|r-obl:lmod",
+     "503": "PRON|PronType=Dem|root",
+     "504": "PRON|PronType=Int|_",
+     "505": "PRON|PronType=Int|l-advcl",
+     "506": "PRON|PronType=Int|l-amod",
+     "507": "PRON|PronType=Int|l-det",
+     "508": "PRON|PronType=Int|l-dislocated",
+     "509": "PRON|PronType=Int|l-nsubj",
+     "510": "PRON|PronType=Int|l-nsubj:outer",
+     "511": "PRON|PronType=Int|l-obj",
+     "512": "PRON|PronType=Int|l-obl",
+     "513": "PRON|PronType=Int|l-vocative",
+     "514": "PRON|PronType=Int|r-ccomp",
+     "515": "PRON|PronType=Int|r-conj",
+     "516": "PRON|PronType=Int|r-flat",
+     "517": "PRON|PronType=Int|r-obj",
+     "518": "PRON|PronType=Int|r-parataxis",
+     "519": "PRON|PronType=Int|r-xcomp",
+     "520": "PRON|PronType=Int|root",
+     "521": "PRON|PronType=Prs|Reflex=Yes|_",
+     "522": "PRON|PronType=Prs|Reflex=Yes|l-acl",
+     "523": "PRON|PronType=Prs|Reflex=Yes|l-det",
+     "524": "PRON|PronType=Prs|Reflex=Yes|l-nsubj",
+     "525": "PRON|PronType=Prs|Reflex=Yes|l-obj",
+     "526": "PRON|PronType=Prs|Reflex=Yes|l-obl",
+     "527": "PRON|PronType=Prs|Reflex=Yes|r-dislocated",
+     "528": "PRON|PronType=Prs|Reflex=Yes|r-obj",
+     "529": "PRON|PronType=Prs|Reflex=Yes|r-obl",
+     "530": "PRON|PronType=Prs|Reflex=Yes|root",
+     "531": "PRON|PronType=Prs|_",
+     "532": "PRON|PronType=Prs|l-det",
+     "533": "PRON|PronType=Prs|l-nsubj",
+     "534": "PRON|PronType=Prs|l-nsubj:outer",
+     "535": "PRON|PronType=Prs|l-obj",
+     "536": "PRON|PronType=Prs|r-conj",
+     "537": "PRON|PronType=Prs|r-iobj",
+     "538": "PRON|PronType=Prs|r-obj",
+     "539": "PROPN",
+     "540": "PROPN.",
+     "541": "PROPN|Case=Loc|NameType=Geo|_",
+     "542": "PROPN|Case=Loc|NameType=Geo|l-acl",
+     "543": "PROPN|Case=Loc|NameType=Geo|l-advcl",
+     "544": "PROPN|Case=Loc|NameType=Geo|l-amod",
+     "545": "PROPN|Case=Loc|NameType=Geo|l-compound",
+     "546": "PROPN|Case=Loc|NameType=Geo|l-csubj",
+     "547": "PROPN|Case=Loc|NameType=Geo|l-dislocated",
+     "548": "PROPN|Case=Loc|NameType=Geo|l-nmod",
+     "549": "PROPN|Case=Loc|NameType=Geo|l-nsubj",
+     "550": "PROPN|Case=Loc|NameType=Geo|l-nsubj:outer",
+     "551": "PROPN|Case=Loc|NameType=Geo|l-obl",
+     "552": "PROPN|Case=Loc|NameType=Geo|l-obl:lmod",
+     "553": "PROPN|Case=Loc|NameType=Geo|r-conj",
+     "554": "PROPN|Case=Loc|NameType=Geo|r-flat",
+     "555": "PROPN|Case=Loc|NameType=Geo|r-iobj",
+     "556": "PROPN|Case=Loc|NameType=Geo|r-obj",
+     "557": "PROPN|Case=Loc|NameType=Geo|r-obl",
+     "558": "PROPN|Case=Loc|NameType=Geo|r-obl:lmod",
+     "559": "PROPN|Case=Loc|NameType=Geo|r-parataxis",
+     "560": "PROPN|Case=Loc|NameType=Geo|r-xcomp",
+     "561": "PROPN|Case=Loc|NameType=Geo|root",
+     "562": "PROPN|Case=Loc|NameType=Nat|_",
+     "563": "PROPN|Case=Loc|NameType=Nat|l-acl",
+     "564": "PROPN|Case=Loc|NameType=Nat|l-advcl",
+     "565": "PROPN|Case=Loc|NameType=Nat|l-amod",
+     "566": "PROPN|Case=Loc|NameType=Nat|l-clf",
+     "567": "PROPN|Case=Loc|NameType=Nat|l-compound",
+     "568": "PROPN|Case=Loc|NameType=Nat|l-nmod",
+     "569": "PROPN|Case=Loc|NameType=Nat|l-nsubj",
+     "570": "PROPN|Case=Loc|NameType=Nat|l-nsubj:outer",
+     "571": "PROPN|Case=Loc|NameType=Nat|l-nsubj:pass",
+     "572": "PROPN|Case=Loc|NameType=Nat|l-obj",
+     "573": "PROPN|Case=Loc|NameType=Nat|l-obl",
+     "574": "PROPN|Case=Loc|NameType=Nat|l-obl:lmod",
+     "575": "PROPN|Case=Loc|NameType=Nat|r-ccomp",
+     "576": "PROPN|Case=Loc|NameType=Nat|r-conj",
+     "577": "PROPN|Case=Loc|NameType=Nat|r-flat",
+     "578": "PROPN|Case=Loc|NameType=Nat|r-iobj",
+     "579": "PROPN|Case=Loc|NameType=Nat|r-nmod",
+     "580": "PROPN|Case=Loc|NameType=Nat|r-obj",
+     "581": "PROPN|Case=Loc|NameType=Nat|r-obl",
+     "582": "PROPN|Case=Loc|NameType=Nat|r-obl:lmod",
+     "583": "PROPN|Case=Loc|NameType=Nat|r-parataxis",
+     "584": "PROPN|Case=Loc|NameType=Nat|r-xcomp",
+     "585": "PROPN|Case=Loc|NameType=Nat|root",
+     "586": "PROPN|NameType=Giv|_",
+     "587": "PROPN|NameType=Giv|l-acl",
+     "588": "PROPN|NameType=Giv|l-advcl",
+     "589": "PROPN|NameType=Giv|l-amod",
+     "590": "PROPN|NameType=Giv|l-compound",
+     "591": "PROPN|NameType=Giv|l-dislocated",
+     "592": "PROPN|NameType=Giv|l-nmod",
+     "593": "PROPN|NameType=Giv|l-nsubj",
+     "594": "PROPN|NameType=Giv|l-nsubj:outer",
+     "595": "PROPN|NameType=Giv|l-nsubj:pass",
+     "596": "PROPN|NameType=Giv|l-obj",
+     "597": "PROPN|NameType=Giv|l-obl",
+     "598": "PROPN|NameType=Giv|l-obl:lmod",
+     "599": "PROPN|NameType=Giv|l-parataxis",
+     "600": "PROPN|NameType=Giv|l-vocative",
+     "601": "PROPN|NameType=Giv|r-appos",
+     "602": "PROPN|NameType=Giv|r-ccomp",
+     "603": "PROPN|NameType=Giv|r-conj",
+     "604": "PROPN|NameType=Giv|r-dislocated",
+     "605": "PROPN|NameType=Giv|r-flat",
+     "606": "PROPN|NameType=Giv|r-iobj",
+     "607": "PROPN|NameType=Giv|r-list",
+     "608": "PROPN|NameType=Giv|r-nmod",
+     "609": "PROPN|NameType=Giv|r-obj",
+     "610": "PROPN|NameType=Giv|r-obl",
+     "611": "PROPN|NameType=Giv|r-obl:lmod",
+     "612": "PROPN|NameType=Giv|r-parataxis",
+     "613": "PROPN|NameType=Giv|r-xcomp",
+     "614": "PROPN|NameType=Giv|root",
+     "615": "PROPN|NameType=Prs|_",
+     "616": "PROPN|NameType=Prs|l-acl",
+     "617": "PROPN|NameType=Prs|l-advcl",
+     "618": "PROPN|NameType=Prs|l-amod",
+     "619": "PROPN|NameType=Prs|l-compound",
+     "620": "PROPN|NameType=Prs|l-dislocated",
+     "621": "PROPN|NameType=Prs|l-nmod",
+     "622": "PROPN|NameType=Prs|l-nsubj",
+     "623": "PROPN|NameType=Prs|l-nsubj:outer",
+     "624": "PROPN|NameType=Prs|l-obj",
+     "625": "PROPN|NameType=Prs|l-obl",
+     "626": "PROPN|NameType=Prs|r-conj",
+     "627": "PROPN|NameType=Prs|r-dislocated",
+     "628": "PROPN|NameType=Prs|r-flat",
+     "629": "PROPN|NameType=Prs|r-iobj",
+     "630": "PROPN|NameType=Prs|r-obj",
+     "631": "PROPN|NameType=Prs|r-obl",
+     "632": "PROPN|NameType=Prs|r-parataxis",
+     "633": "PROPN|NameType=Prs|root",
+     "634": "PROPN|NameType=Sur|_",
+     "635": "PROPN|NameType=Sur|l-acl",
+     "636": "PROPN|NameType=Sur|l-advcl",
+     "637": "PROPN|NameType=Sur|l-amod",
+     "638": "PROPN|NameType=Sur|l-compound",
+     "639": "PROPN|NameType=Sur|l-csubj",
+     "640": "PROPN|NameType=Sur|l-dislocated",
+     "641": "PROPN|NameType=Sur|l-nmod",
+     "642": "PROPN|NameType=Sur|l-nsubj",
+     "643": "PROPN|NameType=Sur|l-nsubj:outer",
+     "644": "PROPN|NameType=Sur|l-nsubj:pass",
+     "645": "PROPN|NameType=Sur|l-obl",
+     "646": "PROPN|NameType=Sur|l-obl:lmod",
+     "647": "PROPN|NameType=Sur|l-vocative",
+     "648": "PROPN|NameType=Sur|r-ccomp",
+     "649": "PROPN|NameType=Sur|r-conj",
+     "650": "PROPN|NameType=Sur|r-dislocated",
+     "651": "PROPN|NameType=Sur|r-flat",
+     "652": "PROPN|NameType=Sur|r-iobj",
+     "653": "PROPN|NameType=Sur|r-list",
+     "654": "PROPN|NameType=Sur|r-nmod",
+     "655": "PROPN|NameType=Sur|r-nsubj",
+     "656": "PROPN|NameType=Sur|r-obj",
+     "657": "PROPN|NameType=Sur|r-obl",
+     "658": "PROPN|NameType=Sur|r-obl:lmod",
+     "659": "PROPN|NameType=Sur|r-parataxis",
+     "660": "PROPN|NameType=Sur|r-xcomp",
+     "661": "PROPN|NameType=Sur|root",
+     "662": "PROPN|_",
+     "663": "PROPN|l-nmod",
+     "664": "PUNCT",
+     "665": "PUNCT.",
+     "666": "PUNCT|_",
+     "667": "PUNCT|root",
+     "668": "SCONJ",
+     "669": "SCONJ.",
+     "670": "SCONJ|_",
+     "671": "SCONJ|l-case",
+     "672": "SCONJ|l-cc",
+     "673": "SCONJ|l-mark",
+     "674": "SCONJ|l-nsubj",
+     "675": "SCONJ|l-obl",
+     "676": "SCONJ|r-case",
+     "677": "SCONJ|r-iobj",
+     "678": "SCONJ|r-mark",
+     "679": "SCONJ|r-nsubj",
+     "680": "SCONJ|r-nsubj:pass",
+     "681": "SCONJ|r-obj",
+     "682": "SCONJ|root",
+     "683": "SYM",
+     "684": "SYM.",
+     "685": "SYM|_",
+     "686": "SYM|l-nmod",
+     "687": "SYM|l-nsubj",
+     "688": "SYM|r-conj",
+     "689": "SYM|r-nmod",
+     "690": "SYM|r-xcomp",
+     "691": "SYM|root",
+     "692": "VERB",
+     "693": "VERB.",
+     "694": "VERB|Degree=Equ|VerbForm=Part|_",
+     "695": "VERB|Degree=Equ|VerbForm=Part|l-amod",
+     "696": "VERB|Degree=Equ|_",
+     "697": "VERB|Degree=Equ|l-acl",
+     "698": "VERB|Degree=Equ|l-advcl",
+     "699": "VERB|Degree=Equ|l-ccomp",
+     "700": "VERB|Degree=Equ|l-csubj",
+     "701": "VERB|Degree=Equ|l-nsubj",
+     "702": "VERB|Degree=Equ|l-obj",
+     "703": "VERB|Degree=Equ|r-ccomp",
+     "704": "VERB|Degree=Equ|r-compound:redup",
+     "705": "VERB|Degree=Equ|r-conj",
+     "706": "VERB|Degree=Equ|r-obj",
+     "707": "VERB|Degree=Equ|r-parataxis",
+     "708": "VERB|Degree=Equ|r-xcomp",
+     "709": "VERB|Degree=Equ|root",
+     "710": "VERB|Degree=Pos|VerbForm=Part|_",
+     "711": "VERB|Degree=Pos|VerbForm=Part|l-amod",
+     "712": "VERB|Degree=Pos|VerbForm=Part|r-amod",
+     "713": "VERB|Degree=Pos|_",
+     "714": "VERB|Degree=Pos|l-acl",
+     "715": "VERB|Degree=Pos|l-advcl",
+     "716": "VERB|Degree=Pos|l-ccomp",
+     "717": "VERB|Degree=Pos|l-csubj",
+     "718": "VERB|Degree=Pos|l-csubj:outer",
+     "719": "VERB|Degree=Pos|l-dislocated",
+     "720": "VERB|Degree=Pos|l-nsubj",
+     "721": "VERB|Degree=Pos|l-nsubj:outer",
+     "722": "VERB|Degree=Pos|l-obj",
+     "723": "VERB|Degree=Pos|l-obl",
+     "724": "VERB|Degree=Pos|l-vocative",
+     "725": "VERB|Degree=Pos|r-advcl",
+     "726": "VERB|Degree=Pos|r-ccomp",
+     "727": "VERB|Degree=Pos|r-compound:redup",
+     "728": "VERB|Degree=Pos|r-conj",
+     "729": "VERB|Degree=Pos|r-dislocated",
+     "730": "VERB|Degree=Pos|r-fixed",
+     "731": "VERB|Degree=Pos|r-flat:vv",
+     "732": "VERB|Degree=Pos|r-iobj",
+     "733": "VERB|Degree=Pos|r-obj",
+     "734": "VERB|Degree=Pos|r-obl",
+     "735": "VERB|Degree=Pos|r-parataxis",
+     "736": "VERB|Degree=Pos|r-xcomp",
+     "737": "VERB|Degree=Pos|root",
+     "738": "VERB|Polarity=Neg|VerbForm=Part|_",
+     "739": "VERB|Polarity=Neg|VerbForm=Part|l-amod",
+     "740": "VERB|Polarity=Neg|_",
+     "741": "VERB|Polarity=Neg|l-acl",
+     "742": "VERB|Polarity=Neg|l-advcl",
+     "743": "VERB|Polarity=Neg|l-ccomp",
+     "744": "VERB|Polarity=Neg|l-csubj",
+     "745": "VERB|Polarity=Neg|l-csubj:outer",
+     "746": "VERB|Polarity=Neg|l-nsubj",
+     "747": "VERB|Polarity=Neg|l-obl",
+     "748": "VERB|Polarity=Neg|r-advcl",
+     "749": "VERB|Polarity=Neg|r-ccomp",
+     "750": "VERB|Polarity=Neg|r-conj",
+     "751": "VERB|Polarity=Neg|r-flat:vv",
+     "752": "VERB|Polarity=Neg|r-obj",
+     "753": "VERB|Polarity=Neg|r-obl",
+     "754": "VERB|Polarity=Neg|r-parataxis",
+     "755": "VERB|Polarity=Neg|r-xcomp",
+     "756": "VERB|Polarity=Neg|root",
+     "757": "VERB|VerbForm=Part|_",
+     "758": "VERB|VerbForm=Part|l-amod",
+     "759": "VERB|VerbForm=Part|r-amod",
+     "760": "VERB|_",
+     "761": "VERB|l-acl",
+     "762": "VERB|l-advcl",
785
+ "763": "VERB|l-ccomp",
786
+ "764": "VERB|l-csubj",
787
+ "765": "VERB|l-csubj:outer",
788
+ "766": "VERB|l-csubj:pass",
789
+ "767": "VERB|l-dislocated",
790
+ "768": "VERB|l-nsubj",
791
+ "769": "VERB|l-nsubj:outer",
792
+ "770": "VERB|l-obj",
793
+ "771": "VERB|l-obl",
794
+ "772": "VERB|l-obl:lmod",
795
+ "773": "VERB|l-parataxis",
796
+ "774": "VERB|r-acl",
797
+ "775": "VERB|r-advcl",
798
+ "776": "VERB|r-ccomp",
799
+ "777": "VERB|r-compound:redup",
800
+ "778": "VERB|r-conj",
801
+ "779": "VERB|r-dislocated",
802
+ "780": "VERB|r-fixed",
803
+ "781": "VERB|r-flat:vv",
804
+ "782": "VERB|r-iobj",
805
+ "783": "VERB|r-list",
806
+ "784": "VERB|r-obj",
807
+ "785": "VERB|r-obl",
808
+ "786": "VERB|r-obl:lmod",
809
+ "787": "VERB|r-parataxis",
810
+ "788": "VERB|r-vocative",
811
+ "789": "VERB|r-xcomp",
812
+ "790": "VERB|root"
813
+ },
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "ADP": 0,
+    "ADP.": 1,
+    "ADP|Degree=Equ|_": 2,
+    "ADP|Degree=Equ|l-cc": 3,
+    "ADP|_": 4,
+    "ADP|l-acl": 5,
+    "ADP|l-advcl": 6,
+    "ADP|l-amod": 7,
+    "ADP|l-case": 8,
+    "ADP|l-cc": 9,
+    "ADP|l-mark": 10,
+    "ADP|l-nsubj": 11,
+    "ADP|l-obl": 12,
+    "ADP|r-case": 13,
+    "ADP|r-conj": 14,
+    "ADP|r-fixed": 15,
+    "ADP|r-mark": 16,
+    "ADP|r-obj": 17,
+    "ADP|root": 18,
+    "ADV": 19,
+    "ADV.": 20,
+    "ADV|AdvType=Cau|_": 21,
+    "ADV|AdvType=Cau|l-advmod": 22,
+    "ADV|AdvType=Cau|l-amod": 23,
+    "ADV|AdvType=Cau|l-nsubj": 24,
+    "ADV|AdvType=Cau|l-obj": 25,
+    "ADV|AdvType=Deg|Degree=Cmp|_": 26,
+    "ADV|AdvType=Deg|Degree=Cmp|l-advmod": 27,
+    "ADV|AdvType=Deg|Degree=Cmp|l-amod": 28,
+    "ADV|AdvType=Deg|Degree=Cmp|r-conj": 29,
+    "ADV|AdvType=Deg|Degree=Cmp|r-obj": 30,
+    "ADV|AdvType=Deg|Degree=Pos|_": 31,
+    "ADV|AdvType=Deg|Degree=Pos|l-advmod": 32,
+    "ADV|AdvType=Deg|Degree=Pos|l-amod": 33,
+    "ADV|AdvType=Deg|Degree=Pos|r-ccomp": 34,
+    "ADV|AdvType=Deg|Degree=Pos|r-conj": 35,
+    "ADV|AdvType=Deg|Degree=Pos|r-flat:vv": 36,
+    "ADV|AdvType=Deg|Degree=Pos|r-parataxis": 37,
+    "ADV|AdvType=Deg|Degree=Pos|root": 38,
+    "ADV|AdvType=Deg|Degree=Sup|_": 39,
+    "ADV|AdvType=Deg|Degree=Sup|l-advmod": 40,
+    "ADV|AdvType=Deg|Degree=Sup|l-amod": 41,
+    "ADV|AdvType=Deg|Degree=Sup|l-nsubj": 42,
+    "ADV|AdvType=Deg|Degree=Sup|r-conj": 43,
+    "ADV|AdvType=Deg|Degree=Sup|r-parataxis": 44,
+    "ADV|AdvType=Deg|Degree=Sup|root": 45,
+    "ADV|AdvType=Tim|Aspect=Perf|_": 46,
+    "ADV|AdvType=Tim|Aspect=Perf|l-advmod": 47,
+    "ADV|AdvType=Tim|Aspect=Perf|l-amod": 48,
+    "ADV|AdvType=Tim|Aspect=Perf|l-obl:lmod": 49,
+    "ADV|AdvType=Tim|Aspect=Perf|r-parataxis": 50,
+    "ADV|AdvType=Tim|Aspect=Perf|root": 51,
+    "ADV|AdvType=Tim|Tense=Fut|_": 52,
+    "ADV|AdvType=Tim|Tense=Fut|l-advmod": 53,
+    "ADV|AdvType=Tim|Tense=Fut|l-amod": 54,
+    "ADV|AdvType=Tim|Tense=Fut|l-nsubj": 55,
+    "ADV|AdvType=Tim|Tense=Fut|l-nsubj:outer": 56,
+    "ADV|AdvType=Tim|Tense=Fut|root": 57,
+    "ADV|AdvType=Tim|Tense=Past|_": 58,
+    "ADV|AdvType=Tim|Tense=Past|l-advmod": 59,
+    "ADV|AdvType=Tim|Tense=Past|l-amod": 60,
+    "ADV|AdvType=Tim|Tense=Pres|_": 61,
+    "ADV|AdvType=Tim|Tense=Pres|l-advmod": 62,
+    "ADV|AdvType=Tim|Tense=Pres|l-amod": 63,
+    "ADV|AdvType=Tim|Tense=Pres|root": 64,
+    "ADV|AdvType=Tim|_": 65,
+    "ADV|AdvType=Tim|l-advcl": 66,
+    "ADV|AdvType=Tim|l-advmod": 67,
+    "ADV|AdvType=Tim|l-amod": 68,
+    "ADV|AdvType=Tim|l-nsubj": 69,
+    "ADV|AdvType=Tim|r-advmod": 70,
+    "ADV|AdvType=Tim|r-ccomp": 71,
+    "ADV|AdvType=Tim|r-compound:redup": 72,
+    "ADV|AdvType=Tim|r-conj": 73,
+    "ADV|AdvType=Tim|r-flat:vv": 74,
+    "ADV|AdvType=Tim|r-parataxis": 75,
+    "ADV|AdvType=Tim|root": 76,
+    "ADV|Degree=Equ|VerbForm=Conv|_": 77,
+    "ADV|Degree=Equ|VerbForm=Conv|l-advmod": 78,
+    "ADV|Degree=Pos|VerbForm=Conv|_": 79,
+    "ADV|Degree=Pos|VerbForm=Conv|l-advmod": 80,
+    "ADV|Degree=Pos|VerbForm=Conv|r-advmod": 81,
+    "ADV|Polarity=Neg|VerbForm=Conv|_": 82,
+    "ADV|Polarity=Neg|VerbForm=Conv|l-advmod": 83,
+    "ADV|Polarity=Neg|_": 84,
+    "ADV|Polarity=Neg|l-advmod": 85,
+    "ADV|Polarity=Neg|l-amod": 86,
+    "ADV|Polarity=Neg|l-nsubj": 87,
+    "ADV|Polarity=Neg|l-parataxis": 88,
+    "ADV|Polarity=Neg|r-advmod": 89,
+    "ADV|Polarity=Neg|r-conj": 90,
+    "ADV|Polarity=Neg|r-obj": 91,
+    "ADV|Polarity=Neg|r-parataxis": 92,
+    "ADV|Polarity=Neg|root": 93,
+    "ADV|VerbForm=Conv|_": 94,
+    "ADV|VerbForm=Conv|l-advmod": 95,
+    "ADV|VerbForm=Conv|r-advmod": 96,
+    "ADV|_": 97,
+    "ADV|l-acl": 98,
+    "ADV|l-advcl": 99,
+    "ADV|l-advmod": 100,
+    "ADV|l-amod": 101,
+    "ADV|l-cc": 102,
+    "ADV|l-nsubj": 103,
+    "ADV|r-advmod": 104,
+    "ADV|r-ccomp": 105,
+    "ADV|r-conj": 106,
+    "ADV|r-flat:vv": 107,
+    "ADV|r-obj": 108,
+    "ADV|root": 109,
+    "AUX": 110,
+    "AUX.": 111,
+    "AUX|Mood=Des|_": 112,
+    "AUX|Mood=Des|l-aux": 113,
+    "AUX|Mood=Des|l-csubj": 114,
+    "AUX|Mood=Des|l-parataxis": 115,
+    "AUX|Mood=Des|r-ccomp": 116,
+    "AUX|Mood=Des|r-conj": 117,
+    "AUX|Mood=Des|r-flat:vv": 118,
+    "AUX|Mood=Des|root": 119,
+    "AUX|Mood=Nec|_": 120,
+    "AUX|Mood=Nec|l-acl": 121,
+    "AUX|Mood=Nec|l-amod": 122,
+    "AUX|Mood=Nec|l-aux": 123,
+    "AUX|Mood=Nec|r-aux": 124,
+    "AUX|Mood=Nec|root": 125,
+    "AUX|Mood=Pot|_": 126,
+    "AUX|Mood=Pot|l-acl": 127,
+    "AUX|Mood=Pot|l-advcl": 128,
+    "AUX|Mood=Pot|l-amod": 129,
+    "AUX|Mood=Pot|l-aux": 130,
+    "AUX|Mood=Pot|l-csubj": 131,
+    "AUX|Mood=Pot|l-nsubj": 132,
+    "AUX|Mood=Pot|r-ccomp": 133,
+    "AUX|Mood=Pot|r-conj": 134,
+    "AUX|Mood=Pot|r-obj": 135,
+    "AUX|Mood=Pot|r-parataxis": 136,
+    "AUX|Mood=Pot|r-xcomp": 137,
+    "AUX|Mood=Pot|root": 138,
+    "AUX|VerbType=Cop|_": 139,
+    "AUX|VerbType=Cop|l-cop": 140,
+    "AUX|Voice=Pass|_": 141,
+    "AUX|Voice=Pass|l-aux": 142,
+    "AUX|Voice=Pass|r-conj": 143,
+    "AUX|Voice=Pass|root": 144,
+    "B-ADP": 145,
+    "B-ADP.": 146,
+    "B-ADV": 147,
+    "B-ADV.": 148,
+    "B-AUX": 149,
+    "B-AUX.": 150,
+    "B-CCONJ": 151,
+    "B-CCONJ.": 152,
+    "B-INTJ": 153,
+    "B-INTJ.": 154,
+    "B-NOUN": 155,
+    "B-NOUN.": 156,
+    "B-NUM": 157,
+    "B-NUM.": 158,
+    "B-PART": 159,
+    "B-PART.": 160,
+    "B-PRON": 161,
+    "B-PRON.": 162,
+    "B-PROPN": 163,
+    "B-PROPN.": 164,
+    "B-PUNCT": 165,
+    "B-PUNCT.": 166,
+    "B-SCONJ": 167,
+    "B-SCONJ.": 168,
+    "B-SYM": 169,
+    "B-SYM.": 170,
+    "B-VERB": 171,
+    "B-VERB.": 172,
+    "CCONJ": 173,
+    "CCONJ.": 174,
+    "CCONJ|_": 175,
+    "CCONJ|l-advmod": 176,
+    "CCONJ|l-amod": 177,
+    "CCONJ|l-cc": 178,
+    "CCONJ|l-obj": 179,
+    "CCONJ|r-fixed": 180,
+    "CCONJ|r-orphan": 181,
+    "I-ADP": 182,
+    "I-ADP.": 183,
+    "I-ADV": 184,
+    "I-ADV.": 185,
+    "I-AUX": 186,
+    "I-AUX.": 187,
+    "I-CCONJ": 188,
+    "I-CCONJ.": 189,
+    "I-INTJ": 190,
+    "I-INTJ.": 191,
+    "I-NOUN": 192,
+    "I-NOUN.": 193,
+    "I-NUM": 194,
+    "I-NUM.": 195,
+    "I-PART": 196,
+    "I-PART.": 197,
+    "I-PRON": 198,
+    "I-PRON.": 199,
+    "I-PROPN": 200,
+    "I-PROPN.": 201,
+    "I-PUNCT": 202,
+    "I-PUNCT.": 203,
+    "I-SCONJ": 204,
+    "I-SCONJ.": 205,
+    "I-SYM": 206,
+    "I-SYM.": 207,
+    "I-VERB": 208,
+    "I-VERB.": 209,
+    "INTJ": 210,
+    "INTJ.": 211,
+    "INTJ|_": 212,
+    "INTJ|l-advcl": 213,
+    "INTJ|l-csubj": 214,
+    "INTJ|l-discourse": 215,
+    "INTJ|l-discourse:sp": 216,
+    "INTJ|l-dislocated": 217,
+    "INTJ|l-nsubj": 218,
+    "INTJ|l-vocative": 219,
+    "INTJ|r-compound:redup": 220,
+    "INTJ|r-conj": 221,
+    "INTJ|r-discourse:sp": 222,
+    "INTJ|r-dislocated": 223,
+    "INTJ|r-fixed": 224,
+    "INTJ|r-obj": 225,
+    "INTJ|r-parataxis": 226,
+    "INTJ|root": 227,
+    "NOUN": 228,
+    "NOUN.": 229,
+    "NOUN|Case=Loc|_": 230,
+    "NOUN|Case=Loc|l-acl": 231,
+    "NOUN|Case=Loc|l-advcl": 232,
+    "NOUN|Case=Loc|l-amod": 233,
+    "NOUN|Case=Loc|l-clf": 234,
+    "NOUN|Case=Loc|l-compound": 235,
+    "NOUN|Case=Loc|l-csubj": 236,
+    "NOUN|Case=Loc|l-dislocated": 237,
+    "NOUN|Case=Loc|l-nmod": 238,
+    "NOUN|Case=Loc|l-nsubj": 239,
+    "NOUN|Case=Loc|l-nsubj:outer": 240,
+    "NOUN|Case=Loc|l-obj": 241,
+    "NOUN|Case=Loc|l-obl": 242,
+    "NOUN|Case=Loc|l-obl:lmod": 243,
+    "NOUN|Case=Loc|l-obl:tmod": 244,
+    "NOUN|Case=Loc|l-parataxis": 245,
+    "NOUN|Case=Loc|r-ccomp": 246,
+    "NOUN|Case=Loc|r-clf": 247,
+    "NOUN|Case=Loc|r-compound:redup": 248,
+    "NOUN|Case=Loc|r-conj": 249,
+    "NOUN|Case=Loc|r-dislocated": 250,
+    "NOUN|Case=Loc|r-flat": 251,
+    "NOUN|Case=Loc|r-iobj": 252,
+    "NOUN|Case=Loc|r-list": 253,
+    "NOUN|Case=Loc|r-nmod": 254,
+    "NOUN|Case=Loc|r-nsubj": 255,
+    "NOUN|Case=Loc|r-obj": 256,
+    "NOUN|Case=Loc|r-obl": 257,
+    "NOUN|Case=Loc|r-obl:lmod": 258,
+    "NOUN|Case=Loc|r-parataxis": 259,
+    "NOUN|Case=Loc|r-xcomp": 260,
+    "NOUN|Case=Loc|root": 261,
+    "NOUN|Case=Tem|_": 262,
+    "NOUN|Case=Tem|l-acl": 263,
+    "NOUN|Case=Tem|l-advcl": 264,
+    "NOUN|Case=Tem|l-amod": 265,
+    "NOUN|Case=Tem|l-compound": 266,
+    "NOUN|Case=Tem|l-csubj": 267,
+    "NOUN|Case=Tem|l-nmod": 268,
+    "NOUN|Case=Tem|l-nsubj": 269,
+    "NOUN|Case=Tem|l-nsubj:outer": 270,
+    "NOUN|Case=Tem|l-obj": 271,
+    "NOUN|Case=Tem|l-obl:tmod": 272,
+    "NOUN|Case=Tem|r-amod": 273,
+    "NOUN|Case=Tem|r-ccomp": 274,
+    "NOUN|Case=Tem|r-clf": 275,
+    "NOUN|Case=Tem|r-compound:redup": 276,
+    "NOUN|Case=Tem|r-conj": 277,
+    "NOUN|Case=Tem|r-flat": 278,
+    "NOUN|Case=Tem|r-iobj": 279,
+    "NOUN|Case=Tem|r-list": 280,
+    "NOUN|Case=Tem|r-nsubj": 281,
+    "NOUN|Case=Tem|r-obj": 282,
+    "NOUN|Case=Tem|r-obl:tmod": 283,
+    "NOUN|Case=Tem|r-parataxis": 284,
+    "NOUN|Case=Tem|r-xcomp": 285,
+    "NOUN|Case=Tem|root": 286,
+    "NOUN|Degree=Pos|_": 287,
+    "NOUN|Degree=Pos|root": 288,
+    "NOUN|NounType=Clf|_": 289,
+    "NOUN|NounType=Clf|l-clf": 290,
+    "NOUN|NounType=Clf|l-nmod": 291,
+    "NOUN|NounType=Clf|l-nsubj": 292,
+    "NOUN|NounType=Clf|l-obl": 293,
+    "NOUN|NounType=Clf|r-ccomp": 294,
+    "NOUN|NounType=Clf|r-clf": 295,
+    "NOUN|NounType=Clf|r-compound:redup": 296,
+    "NOUN|NounType=Clf|r-conj": 297,
+    "NOUN|NounType=Clf|r-flat": 298,
+    "NOUN|NounType=Clf|r-obj": 299,
+    "NOUN|NounType=Clf|r-parataxis": 300,
+    "NOUN|NounType=Clf|root": 301,
+    "NOUN|_": 302,
+    "NOUN|l-acl": 303,
+    "NOUN|l-advcl": 304,
+    "NOUN|l-amod": 305,
+    "NOUN|l-ccomp": 306,
+    "NOUN|l-clf": 307,
+    "NOUN|l-compound": 308,
+    "NOUN|l-csubj": 309,
+    "NOUN|l-csubj:outer": 310,
+    "NOUN|l-dislocated": 311,
+    "NOUN|l-iobj": 312,
+    "NOUN|l-list": 313,
+    "NOUN|l-nmod": 314,
+    "NOUN|l-nsubj": 315,
+    "NOUN|l-nsubj:outer": 316,
+    "NOUN|l-nsubj:pass": 317,
+    "NOUN|l-obj": 318,
+    "NOUN|l-obl": 319,
+    "NOUN|l-obl:lmod": 320,
+    "NOUN|l-obl:tmod": 321,
+    "NOUN|l-vocative": 322,
+    "NOUN|r-acl": 323,
+    "NOUN|r-advcl": 324,
+    "NOUN|r-amod": 325,
+    "NOUN|r-ccomp": 326,
+    "NOUN|r-clf": 327,
+    "NOUN|r-compound:redup": 328,
+    "NOUN|r-conj": 329,
+    "NOUN|r-csubj": 330,
+    "NOUN|r-dislocated": 331,
+    "NOUN|r-flat": 332,
+    "NOUN|r-flat:foreign": 333,
+    "NOUN|r-iobj": 334,
+    "NOUN|r-list": 335,
+    "NOUN|r-nmod": 336,
+    "NOUN|r-nsubj": 337,
+    "NOUN|r-obj": 338,
+    "NOUN|r-obl": 339,
+    "NOUN|r-obl:lmod": 340,
+    "NOUN|r-parataxis": 341,
+    "NOUN|r-vocative": 342,
+    "NOUN|r-xcomp": 343,
+    "NOUN|root": 344,
+    "NUM": 345,
+    "NUM.": 346,
+    "NUM|NumType=Ord|_": 347,
+    "NUM|NumType=Ord|l-nsubj": 348,
+    "NUM|NumType=Ord|l-nummod": 349,
+    "NUM|NumType=Ord|l-obl": 350,
+    "NUM|NumType=Ord|l-obl:lmod": 351,
+    "NUM|NumType=Ord|l-obl:tmod": 352,
+    "NUM|NumType=Ord|r-conj": 353,
+    "NUM|NumType=Ord|r-flat": 354,
+    "NUM|NumType=Ord|r-obj": 355,
+    "NUM|NumType=Ord|root": 356,
+    "NUM|_": 357,
+    "NUM|l-acl": 358,
+    "NUM|l-advcl": 359,
+    "NUM|l-compound": 360,
+    "NUM|l-csubj": 361,
+    "NUM|l-dislocated": 362,
+    "NUM|l-nsubj": 363,
+    "NUM|l-nsubj:outer": 364,
+    "NUM|l-nummod": 365,
+    "NUM|l-obj": 366,
+    "NUM|l-obl": 367,
+    "NUM|l-obl:lmod": 368,
+    "NUM|l-obl:tmod": 369,
+    "NUM|r-ccomp": 370,
+    "NUM|r-clf": 371,
+    "NUM|r-compound": 372,
+    "NUM|r-compound:redup": 373,
+    "NUM|r-conj": 374,
+    "NUM|r-flat": 375,
+    "NUM|r-iobj": 376,
+    "NUM|r-list": 377,
+    "NUM|r-nummod": 378,
+    "NUM|r-obj": 379,
+    "NUM|r-obl": 380,
+    "NUM|r-obl:tmod": 381,
+    "NUM|r-parataxis": 382,
+    "NUM|r-xcomp": 383,
+    "NUM|root": 384,
+    "PART": 385,
+    "PART.": 386,
+    "PART|_": 387,
+    "PART|l-acl": 388,
+    "PART|l-advcl": 389,
+    "PART|l-advmod": 390,
+    "PART|l-amod": 391,
+    "PART|l-case": 392,
+    "PART|l-cc": 393,
+    "PART|l-csubj": 394,
+    "PART|l-csubj:outer": 395,
+    "PART|l-discourse": 396,
+    "PART|l-discourse:sp": 397,
+    "PART|l-dislocated": 398,
+    "PART|l-mark": 399,
+    "PART|l-nmod": 400,
+    "PART|l-nsubj": 401,
+    "PART|l-nsubj:outer": 402,
+    "PART|l-nsubj:pass": 403,
+    "PART|l-obj": 404,
+    "PART|l-obl": 405,
+    "PART|l-obl:lmod": 406,
+    "PART|r-advmod": 407,
+    "PART|r-case": 408,
+    "PART|r-ccomp": 409,
+    "PART|r-clf": 410,
+    "PART|r-conj": 411,
+    "PART|r-discourse": 412,
+    "PART|r-discourse:sp": 413,
+    "PART|r-dislocated": 414,
+    "PART|r-fixed": 415,
+    "PART|r-flat": 416,
+    "PART|r-iobj": 417,
+    "PART|r-list": 418,
+    "PART|r-mark": 419,
+    "PART|r-nsubj": 420,
+    "PART|r-obj": 421,
+    "PART|r-obl": 422,
+    "PART|r-parataxis": 423,
+    "PART|r-xcomp": 424,
+    "PART|root": 425,
+    "PRON": 426,
+    "PRON.": 427,
+    "PRON|Person=1|PronType=Prs|_": 428,
+    "PRON|Person=1|PronType=Prs|l-acl": 429,
+    "PRON|Person=1|PronType=Prs|l-advcl": 430,
+    "PRON|Person=1|PronType=Prs|l-det": 431,
+    "PRON|Person=1|PronType=Prs|l-iobj": 432,
+    "PRON|Person=1|PronType=Prs|l-nsubj": 433,
+    "PRON|Person=1|PronType=Prs|l-nsubj:outer": 434,
+    "PRON|Person=1|PronType=Prs|l-obj": 435,
+    "PRON|Person=1|PronType=Prs|l-obl": 436,
+    "PRON|Person=1|PronType=Prs|l-vocative": 437,
+    "PRON|Person=1|PronType=Prs|r-ccomp": 438,
+    "PRON|Person=1|PronType=Prs|r-conj": 439,
+    "PRON|Person=1|PronType=Prs|r-iobj": 440,
+    "PRON|Person=1|PronType=Prs|r-nsubj": 441,
+    "PRON|Person=1|PronType=Prs|r-obj": 442,
+    "PRON|Person=1|PronType=Prs|r-obl": 443,
+    "PRON|Person=1|PronType=Prs|r-obl:lmod": 444,
+    "PRON|Person=1|PronType=Prs|root": 445,
+    "PRON|Person=2|PronType=Prs|_": 446,
+    "PRON|Person=2|PronType=Prs|l-advcl": 447,
+    "PRON|Person=2|PronType=Prs|l-amod": 448,
+    "PRON|Person=2|PronType=Prs|l-det": 449,
+    "PRON|Person=2|PronType=Prs|l-nmod": 450,
+    "PRON|Person=2|PronType=Prs|l-nsubj": 451,
+    "PRON|Person=2|PronType=Prs|l-nsubj:outer": 452,
+    "PRON|Person=2|PronType=Prs|l-obj": 453,
+    "PRON|Person=2|PronType=Prs|l-obl": 454,
+    "PRON|Person=2|PronType=Prs|l-vocative": 455,
+    "PRON|Person=2|PronType=Prs|r-conj": 456,
+    "PRON|Person=2|PronType=Prs|r-flat": 457,
+    "PRON|Person=2|PronType=Prs|r-iobj": 458,
+    "PRON|Person=2|PronType=Prs|r-obj": 459,
+    "PRON|Person=2|PronType=Prs|r-obl": 460,
+    "PRON|Person=2|PronType=Prs|root": 461,
+    "PRON|Person=3|PronType=Prs|_": 462,
+    "PRON|Person=3|PronType=Prs|l-advcl": 463,
+    "PRON|Person=3|PronType=Prs|l-amod": 464,
+    "PRON|Person=3|PronType=Prs|l-det": 465,
+    "PRON|Person=3|PronType=Prs|l-dislocated": 466,
+    "PRON|Person=3|PronType=Prs|l-expl": 467,
+    "PRON|Person=3|PronType=Prs|l-iobj": 468,
+    "PRON|Person=3|PronType=Prs|l-nsubj": 469,
+    "PRON|Person=3|PronType=Prs|l-nsubj:outer": 470,
+    "PRON|Person=3|PronType=Prs|l-nsubj:pass": 471,
+    "PRON|Person=3|PronType=Prs|l-obj": 472,
+    "PRON|Person=3|PronType=Prs|l-obl": 473,
+    "PRON|Person=3|PronType=Prs|r-ccomp": 474,
+    "PRON|Person=3|PronType=Prs|r-conj": 475,
+    "PRON|Person=3|PronType=Prs|r-expl": 476,
+    "PRON|Person=3|PronType=Prs|r-iobj": 477,
+    "PRON|Person=3|PronType=Prs|r-nsubj": 478,
+    "PRON|Person=3|PronType=Prs|r-obj": 479,
+    "PRON|Person=3|PronType=Prs|r-obl": 480,
+    "PRON|Person=3|PronType=Prs|root": 481,
+    "PRON|PronType=Dem|_": 482,
+    "PRON|PronType=Dem|l-acl": 483,
+    "PRON|PronType=Dem|l-advcl": 484,
+    "PRON|PronType=Dem|l-amod": 485,
+    "PRON|PronType=Dem|l-compound": 486,
+    "PRON|PronType=Dem|l-det": 487,
+    "PRON|PronType=Dem|l-dislocated": 488,
+    "PRON|PronType=Dem|l-expl": 489,
+    "PRON|PronType=Dem|l-nsubj": 490,
+    "PRON|PronType=Dem|l-nsubj:outer": 491,
+    "PRON|PronType=Dem|l-obj": 492,
+    "PRON|PronType=Dem|l-obl": 493,
+    "PRON|PronType=Dem|l-obl:lmod": 494,
+    "PRON|PronType=Dem|r-conj": 495,
+    "PRON|PronType=Dem|r-det": 496,
+    "PRON|PronType=Dem|r-expl": 497,
+    "PRON|PronType=Dem|r-flat": 498,
+    "PRON|PronType=Dem|r-iobj": 499,
+    "PRON|PronType=Dem|r-obj": 500,
+    "PRON|PronType=Dem|r-obl": 501,
+    "PRON|PronType=Dem|r-obl:lmod": 502,
+    "PRON|PronType=Dem|root": 503,
+    "PRON|PronType=Int|_": 504,
+    "PRON|PronType=Int|l-advcl": 505,
+    "PRON|PronType=Int|l-amod": 506,
+    "PRON|PronType=Int|l-det": 507,
+    "PRON|PronType=Int|l-dislocated": 508,
+    "PRON|PronType=Int|l-nsubj": 509,
+    "PRON|PronType=Int|l-nsubj:outer": 510,
+    "PRON|PronType=Int|l-obj": 511,
+    "PRON|PronType=Int|l-obl": 512,
+    "PRON|PronType=Int|l-vocative": 513,
+    "PRON|PronType=Int|r-ccomp": 514,
+    "PRON|PronType=Int|r-conj": 515,
+    "PRON|PronType=Int|r-flat": 516,
+    "PRON|PronType=Int|r-obj": 517,
+    "PRON|PronType=Int|r-parataxis": 518,
+    "PRON|PronType=Int|r-xcomp": 519,
+    "PRON|PronType=Int|root": 520,
+    "PRON|PronType=Prs|Reflex=Yes|_": 521,
+    "PRON|PronType=Prs|Reflex=Yes|l-acl": 522,
+    "PRON|PronType=Prs|Reflex=Yes|l-det": 523,
+    "PRON|PronType=Prs|Reflex=Yes|l-nsubj": 524,
+    "PRON|PronType=Prs|Reflex=Yes|l-obj": 525,
+    "PRON|PronType=Prs|Reflex=Yes|l-obl": 526,
+    "PRON|PronType=Prs|Reflex=Yes|r-dislocated": 527,
+    "PRON|PronType=Prs|Reflex=Yes|r-obj": 528,
+    "PRON|PronType=Prs|Reflex=Yes|r-obl": 529,
+    "PRON|PronType=Prs|Reflex=Yes|root": 530,
+    "PRON|PronType=Prs|_": 531,
+    "PRON|PronType=Prs|l-det": 532,
+    "PRON|PronType=Prs|l-nsubj": 533,
+    "PRON|PronType=Prs|l-nsubj:outer": 534,
+    "PRON|PronType=Prs|l-obj": 535,
+    "PRON|PronType=Prs|r-conj": 536,
+    "PRON|PronType=Prs|r-iobj": 537,
+    "PRON|PronType=Prs|r-obj": 538,
+    "PROPN": 539,
+    "PROPN.": 540,
+    "PROPN|Case=Loc|NameType=Geo|_": 541,
+    "PROPN|Case=Loc|NameType=Geo|l-acl": 542,
+    "PROPN|Case=Loc|NameType=Geo|l-advcl": 543,
+    "PROPN|Case=Loc|NameType=Geo|l-amod": 544,
+    "PROPN|Case=Loc|NameType=Geo|l-compound": 545,
+    "PROPN|Case=Loc|NameType=Geo|l-csubj": 546,
+    "PROPN|Case=Loc|NameType=Geo|l-dislocated": 547,
+    "PROPN|Case=Loc|NameType=Geo|l-nmod": 548,
+    "PROPN|Case=Loc|NameType=Geo|l-nsubj": 549,
+    "PROPN|Case=Loc|NameType=Geo|l-nsubj:outer": 550,
+    "PROPN|Case=Loc|NameType=Geo|l-obl": 551,
+    "PROPN|Case=Loc|NameType=Geo|l-obl:lmod": 552,
+    "PROPN|Case=Loc|NameType=Geo|r-conj": 553,
+    "PROPN|Case=Loc|NameType=Geo|r-flat": 554,
+    "PROPN|Case=Loc|NameType=Geo|r-iobj": 555,
+    "PROPN|Case=Loc|NameType=Geo|r-obj": 556,
+    "PROPN|Case=Loc|NameType=Geo|r-obl": 557,
+    "PROPN|Case=Loc|NameType=Geo|r-obl:lmod": 558,
+    "PROPN|Case=Loc|NameType=Geo|r-parataxis": 559,
+    "PROPN|Case=Loc|NameType=Geo|r-xcomp": 560,
+    "PROPN|Case=Loc|NameType=Geo|root": 561,
+    "PROPN|Case=Loc|NameType=Nat|_": 562,
+    "PROPN|Case=Loc|NameType=Nat|l-acl": 563,
+    "PROPN|Case=Loc|NameType=Nat|l-advcl": 564,
+    "PROPN|Case=Loc|NameType=Nat|l-amod": 565,
+    "PROPN|Case=Loc|NameType=Nat|l-clf": 566,
+    "PROPN|Case=Loc|NameType=Nat|l-compound": 567,
+    "PROPN|Case=Loc|NameType=Nat|l-nmod": 568,
+    "PROPN|Case=Loc|NameType=Nat|l-nsubj": 569,
+    "PROPN|Case=Loc|NameType=Nat|l-nsubj:outer": 570,
+    "PROPN|Case=Loc|NameType=Nat|l-nsubj:pass": 571,
+    "PROPN|Case=Loc|NameType=Nat|l-obj": 572,
+    "PROPN|Case=Loc|NameType=Nat|l-obl": 573,
+    "PROPN|Case=Loc|NameType=Nat|l-obl:lmod": 574,
+    "PROPN|Case=Loc|NameType=Nat|r-ccomp": 575,
+    "PROPN|Case=Loc|NameType=Nat|r-conj": 576,
+    "PROPN|Case=Loc|NameType=Nat|r-flat": 577,
+    "PROPN|Case=Loc|NameType=Nat|r-iobj": 578,
+    "PROPN|Case=Loc|NameType=Nat|r-nmod": 579,
+    "PROPN|Case=Loc|NameType=Nat|r-obj": 580,
+    "PROPN|Case=Loc|NameType=Nat|r-obl": 581,
+    "PROPN|Case=Loc|NameType=Nat|r-obl:lmod": 582,
+    "PROPN|Case=Loc|NameType=Nat|r-parataxis": 583,
+    "PROPN|Case=Loc|NameType=Nat|r-xcomp": 584,
+    "PROPN|Case=Loc|NameType=Nat|root": 585,
+    "PROPN|NameType=Giv|_": 586,
+    "PROPN|NameType=Giv|l-acl": 587,
+    "PROPN|NameType=Giv|l-advcl": 588,
+    "PROPN|NameType=Giv|l-amod": 589,
+    "PROPN|NameType=Giv|l-compound": 590,
+    "PROPN|NameType=Giv|l-dislocated": 591,
+    "PROPN|NameType=Giv|l-nmod": 592,
+    "PROPN|NameType=Giv|l-nsubj": 593,
+    "PROPN|NameType=Giv|l-nsubj:outer": 594,
+    "PROPN|NameType=Giv|l-nsubj:pass": 595,
+    "PROPN|NameType=Giv|l-obj": 596,
+    "PROPN|NameType=Giv|l-obl": 597,
+    "PROPN|NameType=Giv|l-obl:lmod": 598,
+    "PROPN|NameType=Giv|l-parataxis": 599,
+    "PROPN|NameType=Giv|l-vocative": 600,
+    "PROPN|NameType=Giv|r-appos": 601,
+    "PROPN|NameType=Giv|r-ccomp": 602,
+    "PROPN|NameType=Giv|r-conj": 603,
+    "PROPN|NameType=Giv|r-dislocated": 604,
+    "PROPN|NameType=Giv|r-flat": 605,
+    "PROPN|NameType=Giv|r-iobj": 606,
+    "PROPN|NameType=Giv|r-list": 607,
+    "PROPN|NameType=Giv|r-nmod": 608,
+    "PROPN|NameType=Giv|r-obj": 609,
+    "PROPN|NameType=Giv|r-obl": 610,
+    "PROPN|NameType=Giv|r-obl:lmod": 611,
+    "PROPN|NameType=Giv|r-parataxis": 612,
+    "PROPN|NameType=Giv|r-xcomp": 613,
+    "PROPN|NameType=Giv|root": 614,
+    "PROPN|NameType=Prs|_": 615,
+    "PROPN|NameType=Prs|l-acl": 616,
+    "PROPN|NameType=Prs|l-advcl": 617,
+    "PROPN|NameType=Prs|l-amod": 618,
+    "PROPN|NameType=Prs|l-compound": 619,
+    "PROPN|NameType=Prs|l-dislocated": 620,
+    "PROPN|NameType=Prs|l-nmod": 621,
+    "PROPN|NameType=Prs|l-nsubj": 622,
+    "PROPN|NameType=Prs|l-nsubj:outer": 623,
+    "PROPN|NameType=Prs|l-obj": 624,
+    "PROPN|NameType=Prs|l-obl": 625,
+    "PROPN|NameType=Prs|r-conj": 626,
+    "PROPN|NameType=Prs|r-dislocated": 627,
+    "PROPN|NameType=Prs|r-flat": 628,
+    "PROPN|NameType=Prs|r-iobj": 629,
+    "PROPN|NameType=Prs|r-obj": 630,
+    "PROPN|NameType=Prs|r-obl": 631,
+    "PROPN|NameType=Prs|r-parataxis": 632,
+    "PROPN|NameType=Prs|root": 633,
+    "PROPN|NameType=Sur|_": 634,
+    "PROPN|NameType=Sur|l-acl": 635,
+    "PROPN|NameType=Sur|l-advcl": 636,
+    "PROPN|NameType=Sur|l-amod": 637,
+    "PROPN|NameType=Sur|l-compound": 638,
+    "PROPN|NameType=Sur|l-csubj": 639,
+    "PROPN|NameType=Sur|l-dislocated": 640,
+    "PROPN|NameType=Sur|l-nmod": 641,
+    "PROPN|NameType=Sur|l-nsubj": 642,
+    "PROPN|NameType=Sur|l-nsubj:outer": 643,
+    "PROPN|NameType=Sur|l-nsubj:pass": 644,
+    "PROPN|NameType=Sur|l-obl": 645,
+    "PROPN|NameType=Sur|l-obl:lmod": 646,
+    "PROPN|NameType=Sur|l-vocative": 647,
+    "PROPN|NameType=Sur|r-ccomp": 648,
+    "PROPN|NameType=Sur|r-conj": 649,
+    "PROPN|NameType=Sur|r-dislocated": 650,
+    "PROPN|NameType=Sur|r-flat": 651,
+    "PROPN|NameType=Sur|r-iobj": 652,
+    "PROPN|NameType=Sur|r-list": 653,
+    "PROPN|NameType=Sur|r-nmod": 654,
+    "PROPN|NameType=Sur|r-nsubj": 655,
+    "PROPN|NameType=Sur|r-obj": 656,
+    "PROPN|NameType=Sur|r-obl": 657,
+    "PROPN|NameType=Sur|r-obl:lmod": 658,
+    "PROPN|NameType=Sur|r-parataxis": 659,
+    "PROPN|NameType=Sur|r-xcomp": 660,
+    "PROPN|NameType=Sur|root": 661,
+    "PROPN|_": 662,
+    "PROPN|l-nmod": 663,
+    "PUNCT": 664,
+    "PUNCT.": 665,
+    "PUNCT|_": 666,
+    "PUNCT|root": 667,
+    "SCONJ": 668,
+    "SCONJ.": 669,
+    "SCONJ|_": 670,
+    "SCONJ|l-case": 671,
+    "SCONJ|l-cc": 672,
+    "SCONJ|l-mark": 673,
+    "SCONJ|l-nsubj": 674,
+    "SCONJ|l-obl": 675,
+    "SCONJ|r-case": 676,
+    "SCONJ|r-iobj": 677,
+    "SCONJ|r-mark": 678,
+    "SCONJ|r-nsubj": 679,
+    "SCONJ|r-nsubj:pass": 680,
+    "SCONJ|r-obj": 681,
+    "SCONJ|root": 682,
+    "SYM": 683,
+    "SYM.": 684,
+    "SYM|_": 685,
+    "SYM|l-nmod": 686,
+    "SYM|l-nsubj": 687,
+    "SYM|r-conj": 688,
+    "SYM|r-nmod": 689,
+    "SYM|r-xcomp": 690,
+    "SYM|root": 691,
+    "VERB": 692,
+    "VERB.": 693,
+    "VERB|Degree=Equ|VerbForm=Part|_": 694,
+    "VERB|Degree=Equ|VerbForm=Part|l-amod": 695,
+    "VERB|Degree=Equ|_": 696,
+    "VERB|Degree=Equ|l-acl": 697,
+    "VERB|Degree=Equ|l-advcl": 698,
+    "VERB|Degree=Equ|l-ccomp": 699,
+    "VERB|Degree=Equ|l-csubj": 700,
+    "VERB|Degree=Equ|l-nsubj": 701,
+    "VERB|Degree=Equ|l-obj": 702,
+    "VERB|Degree=Equ|r-ccomp": 703,
+    "VERB|Degree=Equ|r-compound:redup": 704,
+    "VERB|Degree=Equ|r-conj": 705,
+    "VERB|Degree=Equ|r-obj": 706,
+    "VERB|Degree=Equ|r-parataxis": 707,
+    "VERB|Degree=Equ|r-xcomp": 708,
+    "VERB|Degree=Equ|root": 709,
+    "VERB|Degree=Pos|VerbForm=Part|_": 710,
+    "VERB|Degree=Pos|VerbForm=Part|l-amod": 711,
+    "VERB|Degree=Pos|VerbForm=Part|r-amod": 712,
+    "VERB|Degree=Pos|_": 713,
+    "VERB|Degree=Pos|l-acl": 714,
+    "VERB|Degree=Pos|l-advcl": 715,
+    "VERB|Degree=Pos|l-ccomp": 716,
+    "VERB|Degree=Pos|l-csubj": 717,
+    "VERB|Degree=Pos|l-csubj:outer": 718,
+    "VERB|Degree=Pos|l-dislocated": 719,
+    "VERB|Degree=Pos|l-nsubj": 720,
+    "VERB|Degree=Pos|l-nsubj:outer": 721,
+    "VERB|Degree=Pos|l-obj": 722,
+    "VERB|Degree=Pos|l-obl": 723,
+    "VERB|Degree=Pos|l-vocative": 724,
+    "VERB|Degree=Pos|r-advcl": 725,
+    "VERB|Degree=Pos|r-ccomp": 726,
+    "VERB|Degree=Pos|r-compound:redup": 727,
+    "VERB|Degree=Pos|r-conj": 728,
+    "VERB|Degree=Pos|r-dislocated": 729,
+    "VERB|Degree=Pos|r-fixed": 730,
+    "VERB|Degree=Pos|r-flat:vv": 731,
+    "VERB|Degree=Pos|r-iobj": 732,
+    "VERB|Degree=Pos|r-obj": 733,
+    "VERB|Degree=Pos|r-obl": 734,
+    "VERB|Degree=Pos|r-parataxis": 735,
+    "VERB|Degree=Pos|r-xcomp": 736,
+    "VERB|Degree=Pos|root": 737,
+    "VERB|Polarity=Neg|VerbForm=Part|_": 738,
+    "VERB|Polarity=Neg|VerbForm=Part|l-amod": 739,
+    "VERB|Polarity=Neg|_": 740,
+    "VERB|Polarity=Neg|l-acl": 741,
+    "VERB|Polarity=Neg|l-advcl": 742,
+    "VERB|Polarity=Neg|l-ccomp": 743,
+    "VERB|Polarity=Neg|l-csubj": 744,
+    "VERB|Polarity=Neg|l-csubj:outer": 745,
+    "VERB|Polarity=Neg|l-nsubj": 746,
+    "VERB|Polarity=Neg|l-obl": 747,
+    "VERB|Polarity=Neg|r-advcl": 748,
+    "VERB|Polarity=Neg|r-ccomp": 749,
+    "VERB|Polarity=Neg|r-conj": 750,
+    "VERB|Polarity=Neg|r-flat:vv": 751,
+    "VERB|Polarity=Neg|r-obj": 752,
+    "VERB|Polarity=Neg|r-obl": 753,
+    "VERB|Polarity=Neg|r-parataxis": 754,
+    "VERB|Polarity=Neg|r-xcomp": 755,
+    "VERB|Polarity=Neg|root": 756,
+    "VERB|VerbForm=Part|_": 757,
+    "VERB|VerbForm=Part|l-amod": 758,
+    "VERB|VerbForm=Part|r-amod": 759,
+    "VERB|_": 760,
+    "VERB|l-acl": 761,
+    "VERB|l-advcl": 762,
+    "VERB|l-ccomp": 763,
+    "VERB|l-csubj": 764,
+    "VERB|l-csubj:outer": 765,
+    "VERB|l-csubj:pass": 766,
+    "VERB|l-dislocated": 767,
+    "VERB|l-nsubj": 768,
+    "VERB|l-nsubj:outer": 769,
+    "VERB|l-obj": 770,
+    "VERB|l-obl": 771,
+    "VERB|l-obl:lmod": 772,
+    "VERB|l-parataxis": 773,
+    "VERB|r-acl": 774,
+    "VERB|r-advcl": 775,
+    "VERB|r-ccomp": 776,
+    "VERB|r-compound:redup": 777,
+    "VERB|r-conj": 778,
+    "VERB|r-dislocated": 779,
+    "VERB|r-fixed": 780,
+    "VERB|r-flat:vv": 781,
+    "VERB|r-iobj": 782,
+    "VERB|r-list": 783,
+    "VERB|r-obj": 784,
+    "VERB|r-obl": 785,
+    "VERB|r-obl:lmod": 786,
+    "VERB|r-parataxis": 787,
+    "VERB|r-vocative": 788,
+    "VERB|r-xcomp": 789,
+    "VERB|root": 790
+  },
+  "layer_norm_eps": 1e-12,
+  "lstm_dropout_prob": 0.5,
+  "lstm_embedding_size": 768,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 0,
+  "pooler_fc_size": 768,
+  "pooler_num_attention_heads": 12,
+  "pooler_num_fc_layers": 3,
+  "pooler_size_per_head": 128,
+  "pooler_type": "first_token_transform",
+  "position_embedding_type": "absolute",
+  "tokenizer_class": "BertTokenizerFast",
+  "torch_dtype": "float32",
+  "transformers_version": "4.48.3",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 38208
+}
maker.py ADDED
@@ -0,0 +1,118 @@
+ #! /usr/bin/python3
+ src="Jihuai/bert-ancient-chinese"
+ tgt="KoichiYasuoka/bert-ancient-chinese-base-ud-embeds"
+ url="https://github.com/UniversalDependencies/UD_Classical_Chinese-Kyoto"
+ import os
+ d=os.path.basename(url)
+ os.system("test -d "+d+" || git clone --depth=1 "+url)
+ os.system("for F in train dev test ; do cp "+d+"/*-$F.conllu $F.conllu ; done")
+ class UDEmbedsDataset(object):
+   def __init__(self,conllu,tokenizer,embeddings=None):
+     self.conllu=open(conllu,"r",encoding="utf-8")
+     self.tokenizer=tokenizer
+     self.embeddings=embeddings
+     self.seeks=[0]
+     label=set(["SYM","SYM.","SYM|_"])
+     dep=set()
+     s=self.conllu.readline()
+     while s!="":
+       if s=="\n":
+         self.seeks.append(self.conllu.tell())
+       else:
+         w=s.split("\t")
+         if len(w)==10:
+           if w[0].isdecimal():
+             p=w[3]
+             q="" if w[5]=="_" else "|"+w[5]
+             d=("|" if w[6]=="0" else "|l-" if int(w[0])<int(w[6]) else "|r-")+w[7]
+             for k in [p,p+".","B-"+p,"B-"+p+".","I-"+p,"I-"+p+".",p+q+"|_",p+q+d]:
+               label.add(k)
+       s=self.conllu.readline()
+     self.label2id={l:i for i,l in enumerate(sorted(label))}
+   def __call__(*args):
+     lid={l:i for i,l in enumerate(sorted(set(sum([list(t.label2id) for t in args],[]))))}
+     for t in args:
+       t.label2id=lid
+     return lid
+   def __del__(self):
+     self.conllu.close()
+   __len__=lambda self:(len(self.seeks)-1)*2
+   def __getitem__(self,i):
+     self.conllu.seek(self.seeks[int(i/2)])
+     z,c,t,s=i%2,[],[""],False
+     while t[0]!="\n":
+       t=self.conllu.readline().split("\t")
+       if len(t)==10 and t[0].isdecimal():
+         if s:
+           t[1]=" "+t[1]
+         c.append(t)
+         s=t[9].find("SpaceAfter=No")<0
+     x=[True if t[6]=="0" or int(t[6])>j or sum([1 if int(c[i][6])==j+1 else 0 for i in range(j+1,len(c))])>0 else False for j,t in enumerate(c)]
+     v=self.tokenizer([t[1] for t in c],add_special_tokens=False)["input_ids"]
+     if z==0:
+       ids,upos=[self.tokenizer.cls_token_id],["SYM."]
+       for i,(j,k) in enumerate(zip(v,c)):
+         if j==[]:
+           j=[self.tokenizer.unk_token_id]
+         p=k[3] if x[i] else k[3]+"."
+         ids+=j
+         upos+=[p] if len(j)==1 else ["B-"+p]+["I-"+p]*(len(j)-1)
+       ids.append(self.tokenizer.sep_token_id)
+       upos.append("SYM.")
+       emb=self.embeddings
+     else:
+       import torch
+       if len(x)<31:
+         x=[True]*len(x)
+         w=(len(x)+1)*(len(x)+2)/2
+       else:
+         w=sum([len(x)-i+1 if b else 0 for i,b in enumerate(x)])+1
+         for i in range(len(x)):
+           if x[i]==False and w+len(x)-i<512:
+             x[i]=True
+             w+=len(x)-i+1
+       p=[t[3] if t[5]=="_" else t[3]+"|"+t[5] for i,t in enumerate(c)]
+       d=[t[7] if t[6]=="0" else "l-"+t[7] if int(t[0])<int(t[6]) else "r-"+t[7] for t in c]
+       ids,upos=[-1],["SYM|_"]
+       for i in range(len(x)):
+         if x[i]:
+           ids.append(i)
+           upos.append(p[i]+"|"+d[i] if c[i][6]=="0" else p[i]+"|_")
+           for j in range(i+1,len(x)):
+             ids.append(j)
+             upos.append(p[j]+"|"+d[j] if int(c[j][6])==i+1 else p[i]+"|"+d[i] if int(c[i][6])==j+1 else p[j]+"|_")
+           if i>0 and w>512:
+             while w>512:
+               if upos[-1].endswith("|_"):
+                 upos.pop(-1)
+                 ids.pop(-1)
+                 w-=1
+               else:
+                 break
+           ids.append(-1)
+           upos.append("SYM|_")
+       with torch.no_grad():
+         m=[]
+         for j in v:
+           if j==[]:
+             j=[self.tokenizer.unk_token_id]
+           m.append(self.embeddings[j,:].sum(axis=0))
+         m.append(self.embeddings[self.tokenizer.sep_token_id,:])
+         emb=torch.stack(m)
+     return{"inputs_embeds":emb[ids[:512],:],"labels":[self.label2id[p] for p in upos[:512]]}
+ from transformers import AutoTokenizer,AutoConfig,AutoModelForTokenClassification,DefaultDataCollator,TrainingArguments,Trainer
+ from tokenizers.pre_tokenizers import Sequence,Split
+ from tokenizers import Regex
+ tkz=AutoTokenizer.from_pretrained(src)
+ trainDS=UDEmbedsDataset("train.conllu",tkz)
+ devDS=UDEmbedsDataset("dev.conllu",tkz)
+ testDS=UDEmbedsDataset("test.conllu",tkz)
+ lid=trainDS(devDS,testDS)
+ cfg=AutoConfig.from_pretrained(src,num_labels=len(lid),label2id=lid,id2label={i:l for l,i in lid.items()},ignore_mismatched_sizes=True,trust_remote_code=True)
+ mdl=AutoModelForTokenClassification.from_pretrained(src,config=cfg,ignore_mismatched_sizes=True,trust_remote_code=True)
+ trainDS.embeddings=mdl.get_input_embeddings().weight
+ arg=TrainingArguments(num_train_epochs=3,per_device_train_batch_size=1,dataloader_pin_memory=False,output_dir=tgt,overwrite_output_dir=True,save_total_limit=2,learning_rate=5e-05,warmup_ratio=0.1,save_safetensors=False)
+ trn=Trainer(args=arg,data_collator=DefaultDataCollator(),model=mdl,train_dataset=trainDS)
+ trn.train()
+ trn.save_model(tgt)
+ tkz.save_pretrained(tgt)
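Note on the label scheme in maker.py: `UDEmbedsDataset.__init__` turns each CoNLL-U token row into a family of label strings, which is where entries such as `VERB|root` and `VERB|l-nsubj` in config.json appear to come from. The sketch below restates that step in isolation. The function name `labels_for_row` and the two sample rows are hypothetical, written only for illustration and not copied from the treebank.

```python
def labels_for_row(row):
    """Return the label strings registered for one tab-separated CoNLL-U token row."""
    w = row.split("\t")
    p = w[3]                                # UPOS column
    q = "" if w[5] == "_" else "|" + w[5]   # FEATS column, omitted when empty
    # No direction prefix for the root; "l-" when the token precedes its head,
    # "r-" when it follows its head.
    d = ("|" if w[6] == "0" else "|l-" if int(w[0]) < int(w[6]) else "|r-") + w[7]
    return [p, p + ".", "B-" + p, "B-" + p + ".", "I-" + p, "I-" + p + ".",
            p + q + "|_", p + q + d]

# Hypothetical rows (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC)
root_row = "3\t見\t見\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No"
subj_row = "1\t孟子\t孟子\tPROPN\t_\tNameType=Prs\t3\tnsubj\t_\tSpaceAfter=No"

print(labels_for_row(root_row)[-1])
print(labels_for_row(subj_row)[-1])
```

The first six labels serve the plain tagging pass (with `B-`/`I-` variants for words split into several subword tokens), while the last two combine UPOS, features, and a direction-marked relation for the arc-labelling pass.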
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8224e0557b2a5284bb4c479878bcfe4235c163850bd6d98e420a96dafc6440f1
+ size 461679270
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,62 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_input_names": [
+     "input_ids",
+     "attention_mask"
+   ],
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizerFast",
+   "unk_token": "[UNK]"
+ }
ud.py ADDED
@@ -0,0 +1,150 @@
+ import numpy
+ from transformers import TokenClassificationPipeline
+
+ class BellmanFordTokenClassificationPipeline(TokenClassificationPipeline):
+   def __init__(self,**kwargs):
+     super().__init__(**kwargs)
+     x=self.model.config.label2id
+     y=[k for k in x if k.find("|")<0 and not k.startswith("I-")]
+     self.transition=numpy.full((len(x),len(x)),-numpy.inf)
+     for k,v in x.items():
+       if k.find("|")<0:
+         for j in ["I-"+k[2:]] if k.startswith("B-") else [k]+y if k.startswith("I-") else y:
+           self.transition[v,x[j]]=0
+   def check_model_type(self,supported_models):
+     pass
+   def postprocess(self,model_outputs,**kwargs):
+     if "logits" not in model_outputs:
+       return self.postprocess(model_outputs[0],**kwargs)
+     return self.bellman_ford_token_classification(model_outputs,**kwargs)
+   def bellman_ford_token_classification(self,model_outputs,**kwargs):
+     m=model_outputs["logits"][0].numpy()
+     e=numpy.exp(m-numpy.max(m,axis=-1,keepdims=True))
+     z=e/e.sum(axis=-1,keepdims=True)
+     for i in range(m.shape[0]-1,0,-1):
+       m[i-1]+=numpy.max(m[i]+self.transition,axis=1)
+     k=[numpy.argmax(m[0]+self.transition[0])]
+     for i in range(1,m.shape[0]):
+       k.append(numpy.argmax(m[i]+self.transition[k[-1]]))
+     w=[{"entity":self.model.config.id2label[j],"start":s,"end":e,"score":z[i,j]} for i,((s,e),j) in enumerate(zip(model_outputs["offset_mapping"][0].tolist(),k)) if s<e]
+     if "aggregation_strategy" in kwargs and kwargs["aggregation_strategy"]!="none":
+       for i,t in reversed(list(enumerate(w))):
+         p=t.pop("entity")
+         if p.startswith("I-"):
+           w[i-1]["score"]=min(w[i-1]["score"],t["score"])
+           w[i-1]["end"]=w.pop(i)["end"]
+         elif p.startswith("B-"):
+           t["entity_group"]=p[2:]
+         else:
+           t["entity_group"]=p
+     for t in w:
+       t["text"]=model_outputs["sentence"][t["start"]:t["end"]]
+     return w
+
+ class UniversalDependenciesPipeline(BellmanFordTokenClassificationPipeline):
+   def __init__(self,**kwargs):
+     kwargs["aggregation_strategy"]="simple"
+     super().__init__(**kwargs)
+     x=self.model.config.label2id
+     self.root=numpy.full((len(x)),-numpy.inf)
+     self.left_arc=numpy.full((len(x)),-numpy.inf)
+     self.right_arc=numpy.full((len(x)),-numpy.inf)
+     for k,v in x.items():
+       if k.endswith("|root"):
+         self.root[v]=0
+       elif k.find("|l-")>0:
+         self.left_arc[v]=0
+       elif k.find("|r-")>0:
+         self.right_arc[v]=0
+   def postprocess(self,model_outputs,**kwargs):
+     import torch
+     kwargs["aggregation_strategy"]="simple"
+     if "logits" not in model_outputs:
+       return self.postprocess(model_outputs[0],**kwargs)
+     w=self.bellman_ford_token_classification(model_outputs,**kwargs)
+     off=[(t["start"],t["end"]) for t in w]
+     for i,(s,e) in reversed(list(enumerate(off))):
+       if s<e:
+         d=w[i]["text"]
+         j=len(d)-len(d.lstrip())
+         if j>0:
+           d=d.lstrip()
+           off[i]=(off[i][0]+j,off[i][1])
+         j=len(d)-len(d.rstrip())
+         if j>0:
+           d=d.rstrip()
+           off[i]=(off[i][0],off[i][1]-j)
+         if d.strip()=="":
+           off.pop(i)
+           w.pop(i)
+     v=self.tokenizer([t["text"] for t in w],add_special_tokens=False)
+     x=[not t["entity_group"].endswith(".") for t in w]
+     if len(x)<30:
+       x=[True]*len(x)
+     else:
+       k=sum([len(x)-i+1 if b else 0 for i,b in enumerate(x)])+1
+       for i in numpy.argsort(numpy.array([t["score"] for t in w])):
+         if x[i]==False and k+len(x)-i<512:
+           x[i]=True
+           k+=len(x)-i+1
+     ids=[-1]
+     for i in range(len(x)):
+       if x[i]:
+         ids.append(i)
+         for j in range(i+1,len(x)):
+           ids.append(j)
+         ids.append(-1)
+     with torch.no_grad():
+       e=self.model.get_input_embeddings().weight
+       m=[]
+       for j in v["input_ids"]:
+         if j==[]:
+           j=[self.tokenizer.unk_token_id]
+         m.append(e[j,:].sum(axis=0))
+       m.append(e[self.tokenizer.sep_token_id,:])
+       m=torch.stack(m).to(self.device)
+       e=self.model(inputs_embeds=torch.unsqueeze(m[ids,:],0))
+     m=e.logits[0].cpu().numpy()
+     e=numpy.full((len(x),len(x),m.shape[-1]),m.min())
+     k=1
+     for i in range(len(x)):
+       if x[i]:
+         e[i,i]=m[k]+self.root
+         k+=1
+         for j in range(1,len(x)-i):
+           e[i+j,i]=m[k]+self.left_arc
+           e[i,i+j]=m[k]+self.right_arc
+           k+=1
+         k+=1
+     m,p=numpy.max(e,axis=2),numpy.argmax(e,axis=2)
+     h=self.chu_liu_edmonds(m)
+     z=[i for i,j in enumerate(h) if i==j]
+     if len(z)>1:
+       k,h=z[numpy.argmax(m[z,z])],numpy.min(m)-numpy.max(m)
+       m[:,z]+=[[0 if j in z and (i!=j or i==k) else h for i in z] for j in range(m.shape[0])]
+       h=self.chu_liu_edmonds(m)
+     q=[self.model.config.id2label[p[j,i]].split("|") for i,j in enumerate(h)]
+     t=model_outputs["sentence"].replace("\n"," ")
+     u="# text = "+t+"\n"
+     for i,(s,e) in enumerate(off):
+       u+="\t".join([str(i+1),t[s:e],t[s:e],q[i][0],"_","_" if len(q[i])<3 else "|".join(q[i][1:-1]),str(0 if h[i]==i else h[i]+1),"root" if q[i][-1]=="root" else q[i][-1][2:],"_","_" if i+1<len(off) and e<off[i+1][0] else "SpaceAfter=No"])+"\n"
+     return u+"\n"
+   def chu_liu_edmonds(self,matrix):
+     h=numpy.argmax(matrix,axis=0)
+     x=[-1 if i==j else j for i,j in enumerate(h)]
+     for b in [lambda x,i,j:-1 if i not in x else x[i],lambda x,i,j:-1 if j<0 else x[j]]:
+       y=[]
+       while x!=y:
+         y=list(x)
+         for i,j in enumerate(x):
+           x[i]=b(x,i,j)
+       if max(x)<0:
+         return h
+     y,x=[i for i,j in enumerate(x) if j==max(x)],[i for i,j in enumerate(x) if j<max(x)]
+     z=matrix-numpy.max(matrix,axis=0)
+     m=numpy.block([[z[x,:][:,x],numpy.max(z[x,:][:,y],axis=1).reshape(len(x),1)],[numpy.max(z[y,:][:,x],axis=0),numpy.max(z[y,y])]])
+     k=[j if i==len(x) else x[j] if j<len(x) else y[numpy.argmax(z[y,x[i]])] for i,j in enumerate(self.chu_liu_edmonds(m))]
+     h=[j if i in y else k[x.index(i)] for i,j in enumerate(h)]
+     i=y[numpy.argmax(z[x[k[-1]],y] if k[-1]<len(x) else z[y,y])]
+     h[i]=x[k[-1]] if k[-1]<len(x) else i
+     return h
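Note on the decoding step in ud.py: `bellman_ford_token_classification` folds the best reachable future score into each earlier position, then reads the path forward through a transition matrix that forbids invalid `B-`/`I-` sequences. Below is a minimal pure-Python restatement of that pass on a toy three-label set. The name `constrained_decode`, the label set, and the scores are assumptions made for illustration, and plain lists stand in for the NumPy arrays used in the file.

```python
NEG_INF = float("-inf")

def argmax(values):
    """Index of the first maximum, matching numpy.argmax tie-breaking."""
    return max(range(len(values)), key=lambda i: (values[i], -i))

def constrained_decode(logits, transition):
    """Pick the best label path; transition[a][b] is 0 if b may follow a, else -inf."""
    n, k = len(logits), len(transition)
    m = [list(row) for row in logits]
    # Backward pass: add to each label at position i-1 the best score it can reach at i
    for i in range(n - 1, 0, -1):
        for a in range(k):
            m[i - 1][a] += max(m[i][b] + transition[a][b] for b in range(k))
    # Forward pass: as in the source, the first step is filtered by row 0 of the matrix
    path = [argmax([m[0][b] + transition[0][b] for b in range(k)])]
    for i in range(1, n):
        path.append(argmax([m[i][b] + transition[path[-1]][b] for b in range(k)]))
    return path

# Toy labels: 0 = "NOUN", 1 = "B-NOUN", 2 = "I-NOUN"
transition = [
    [0, 0, NEG_INF],        # after NOUN: NOUN or B-NOUN
    [NEG_INF, NEG_INF, 0],  # after B-NOUN: only I-NOUN
    [0, 0, 0],              # after I-NOUN: anything
]
logits = [[0.0, 3.0, 0.0], [3.0, 0.0, 1.0], [2.0, 0.0, 0.0]]

greedy = [argmax(row) for row in logits]
print(greedy, constrained_decode(logits, transition))
```

On these toy scores, per-position argmax would place `NOUN` directly after `B-NOUN`, whereas the constrained pass continues with `I-NOUN` before returning to `NOUN`.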
vocab.txt ADDED
The diff for this file is too large to render. See raw diff