KoichiYasuoka committed on
Commit
dce55ff
·
1 Parent(s): c79d58f

initial release

Files changed (9)
  1. README.md +33 -0
  2. config.json +1643 -0
  3. maker.py +118 -0
  4. pytorch_model.bin +3 -0
  5. special_tokens_map.json +37 -0
  6. tokenizer.json +0 -0
  7. tokenizer_config.json +68 -0
  8. ud.py +150 -0
  9. vocab.txt +0 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ ---
+ language:
+ - "lzh"
+ tags:
+ - "classical chinese"
+ - "literary chinese"
+ - "ancient chinese"
+ - "token-classification"
+ - "pos"
+ - "dependency-parsing"
+ base_model: KoichiYasuoka/modernbert-large-classical-chinese
+ datasets:
+ - "universal_dependencies"
+ license: "apache-2.0"
+ pipeline_tag: "token-classification"
+ widget:
+ - text: "孟子見梁惠王"
+ ---
+
+ # modernbert-large-classical-chinese-ud-embeds
+
+ ## Model Description
+
+ This is a ModernBERT model pre-trained on Classical Chinese texts for POS-tagging and dependency-parsing, derived from [modernbert-large-classical-chinese](https://huggingface.co/KoichiYasuoka/modernbert-large-classical-chinese) and [UD_Classical_Chinese-Kyoto](https://github.com/UniversalDependencies/UD_Classical_Chinese-Kyoto).
+
+ ## How to Use
+
+ ```py
+ from transformers import pipeline
+ nlp=pipeline("universal-dependencies","KoichiYasuoka/modernbert-large-classical-chinese-ud-embeds",trust_remote_code=True)
+ print(nlp("孟子見梁惠王"))
+ ```
+
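Reading aid (not part of the commit): the `id2label` entries in `config.json` in this commit are pipe-separated strings such as `ADV|AdvType=Deg|Degree=Cmp|l-advmod`. The minimal sketch below only splits such a string into its fields. It assumes that the first field is the UPOS tag, the last field is the relation slot, and anything in between is a feature; the helper name `split_label` is hypothetical and does not come from `ud.py`. The commit does not state the meaning of the `l-`/`r-` prefixes, the `_` placeholder, the trailing `.`, or the `B-`/`I-` tags, so the sketch leaves them uninterpreted.

```python
def split_label(label):
    """Split one id2label string into (upos, feats, relation).

    Assumption (not confirmed by the commit): fields are separated by "|",
    the first is the UPOS tag, the last is the relation slot, and the
    fields in between are morphological features.
    """
    parts = label.split("|")
    upos = parts[0]
    if len(parts) == 1:
        # Bare tags such as "ADP", "ADP." or "B-NOUN" have no further fields.
        return upos, [], None
    feats = parts[1:-1]   # zero or more "Name=Value" features
    relation = parts[-1]  # e.g. "l-advmod", "r-conj", "root" or "_"
    return upos, feats, relation


if __name__ == "__main__":
    # Label strings copied from the id2label table in config.json.
    for label in ["ADP", "PUNCT|_", "NOUN|root",
                  "ADV|AdvType=Deg|Degree=Cmp|l-advmod"]:
        print(label, "->", split_label(label))
```

The function returns the fields unchanged, so any interpretation of the prefixes is left to the caller.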
config.json ADDED
@@ -0,0 +1,1643 @@
+ {
+ "architectures": [
+ "ModernBertForTokenClassification"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "bos_token_id": 0,
+ "classifier_activation": "gelu",
+ "classifier_bias": false,
+ "classifier_dropout": 0.0,
+ "classifier_pooling": "mean",
+ "cls_token_id": 0,
+ "custom_pipelines": {
+ "upos": {
+ "impl": "ud.BellmanFordTokenClassificationPipeline",
+ "pt": "AutoModelForTokenClassification"
+ },
+ "universal-dependencies": {
+ "impl": "ud.UniversalDependenciesPipeline",
+ "pt": "AutoModelForTokenClassification"
+ }
+ },
+ "decoder_bias": true,
+ "deterministic_flash_attn": false,
+ "embedding_dropout": 0.0,
+ "eos_token_id": 2,
+ "global_attn_every_n_layers": 3,
+ "global_rope_theta": 160000.0,
+ "gradient_checkpointing": false,
+ "hidden_activation": "gelu",
+ "hidden_size": 1024,
+ "id2label": {
+ "0": "ADP",
+ "1": "ADP.",
+ "2": "ADP|Degree=Equ|_",
+ "3": "ADP|Degree=Equ|l-cc",
+ "4": "ADP|_",
+ "5": "ADP|l-acl",
+ "6": "ADP|l-advcl",
+ "7": "ADP|l-amod",
+ "8": "ADP|l-case",
+ "9": "ADP|l-cc",
+ "10": "ADP|l-mark",
+ "11": "ADP|l-nsubj",
+ "12": "ADP|l-obl",
+ "13": "ADP|r-case",
+ "14": "ADP|r-conj",
+ "15": "ADP|r-fixed",
+ "16": "ADP|r-mark",
+ "17": "ADP|r-obj",
+ "18": "ADP|root",
+ "19": "ADV",
+ "20": "ADV.",
+ "21": "ADV|AdvType=Cau|_",
+ "22": "ADV|AdvType=Cau|l-advmod",
+ "23": "ADV|AdvType=Cau|l-amod",
+ "24": "ADV|AdvType=Cau|l-nsubj",
+ "25": "ADV|AdvType=Cau|l-obj",
+ "26": "ADV|AdvType=Deg|Degree=Cmp|_",
+ "27": "ADV|AdvType=Deg|Degree=Cmp|l-advmod",
+ "28": "ADV|AdvType=Deg|Degree=Cmp|l-amod",
+ "29": "ADV|AdvType=Deg|Degree=Cmp|r-conj",
+ "30": "ADV|AdvType=Deg|Degree=Cmp|r-obj",
+ "31": "ADV|AdvType=Deg|Degree=Pos|_",
+ "32": "ADV|AdvType=Deg|Degree=Pos|l-advmod",
+ "33": "ADV|AdvType=Deg|Degree=Pos|l-amod",
+ "34": "ADV|AdvType=Deg|Degree=Pos|r-ccomp",
+ "35": "ADV|AdvType=Deg|Degree=Pos|r-conj",
+ "36": "ADV|AdvType=Deg|Degree=Pos|r-flat:vv",
+ "37": "ADV|AdvType=Deg|Degree=Pos|r-parataxis",
+ "38": "ADV|AdvType=Deg|Degree=Pos|root",
+ "39": "ADV|AdvType=Deg|Degree=Sup|_",
+ "40": "ADV|AdvType=Deg|Degree=Sup|l-advmod",
+ "41": "ADV|AdvType=Deg|Degree=Sup|l-amod",
+ "42": "ADV|AdvType=Deg|Degree=Sup|l-nsubj",
+ "43": "ADV|AdvType=Deg|Degree=Sup|r-conj",
+ "44": "ADV|AdvType=Deg|Degree=Sup|r-parataxis",
+ "45": "ADV|AdvType=Deg|Degree=Sup|root",
+ "46": "ADV|AdvType=Tim|Aspect=Perf|_",
+ "47": "ADV|AdvType=Tim|Aspect=Perf|l-advmod",
+ "48": "ADV|AdvType=Tim|Aspect=Perf|l-amod",
+ "49": "ADV|AdvType=Tim|Aspect=Perf|l-obl:lmod",
+ "50": "ADV|AdvType=Tim|Aspect=Perf|r-parataxis",
+ "51": "ADV|AdvType=Tim|Aspect=Perf|root",
+ "52": "ADV|AdvType=Tim|Tense=Fut|_",
+ "53": "ADV|AdvType=Tim|Tense=Fut|l-advmod",
+ "54": "ADV|AdvType=Tim|Tense=Fut|l-amod",
+ "55": "ADV|AdvType=Tim|Tense=Fut|l-nsubj",
+ "56": "ADV|AdvType=Tim|Tense=Fut|l-nsubj:outer",
+ "57": "ADV|AdvType=Tim|Tense=Fut|root",
+ "58": "ADV|AdvType=Tim|Tense=Past|_",
+ "59": "ADV|AdvType=Tim|Tense=Past|l-advmod",
+ "60": "ADV|AdvType=Tim|Tense=Past|l-amod",
+ "61": "ADV|AdvType=Tim|Tense=Pres|_",
+ "62": "ADV|AdvType=Tim|Tense=Pres|l-advmod",
+ "63": "ADV|AdvType=Tim|Tense=Pres|l-amod",
+ "64": "ADV|AdvType=Tim|Tense=Pres|root",
+ "65": "ADV|AdvType=Tim|_",
+ "66": "ADV|AdvType=Tim|l-advcl",
+ "67": "ADV|AdvType=Tim|l-advmod",
+ "68": "ADV|AdvType=Tim|l-amod",
+ "69": "ADV|AdvType=Tim|l-nsubj",
+ "70": "ADV|AdvType=Tim|r-advmod",
+ "71": "ADV|AdvType=Tim|r-ccomp",
+ "72": "ADV|AdvType=Tim|r-compound:redup",
+ "73": "ADV|AdvType=Tim|r-conj",
+ "74": "ADV|AdvType=Tim|r-flat:vv",
+ "75": "ADV|AdvType=Tim|r-parataxis",
+ "76": "ADV|AdvType=Tim|root",
+ "77": "ADV|Degree=Equ|VerbForm=Conv|_",
+ "78": "ADV|Degree=Equ|VerbForm=Conv|l-advmod",
+ "79": "ADV|Degree=Pos|VerbForm=Conv|_",
+ "80": "ADV|Degree=Pos|VerbForm=Conv|l-advmod",
+ "81": "ADV|Degree=Pos|VerbForm=Conv|r-advmod",
+ "82": "ADV|Polarity=Neg|VerbForm=Conv|_",
+ "83": "ADV|Polarity=Neg|VerbForm=Conv|l-advmod",
+ "84": "ADV|Polarity=Neg|_",
+ "85": "ADV|Polarity=Neg|l-advmod",
+ "86": "ADV|Polarity=Neg|l-amod",
+ "87": "ADV|Polarity=Neg|l-nsubj",
+ "88": "ADV|Polarity=Neg|l-parataxis",
+ "89": "ADV|Polarity=Neg|r-advmod",
+ "90": "ADV|Polarity=Neg|r-conj",
+ "91": "ADV|Polarity=Neg|r-obj",
+ "92": "ADV|Polarity=Neg|r-parataxis",
+ "93": "ADV|Polarity=Neg|root",
+ "94": "ADV|VerbForm=Conv|_",
+ "95": "ADV|VerbForm=Conv|l-advmod",
+ "96": "ADV|VerbForm=Conv|r-advmod",
+ "97": "ADV|_",
+ "98": "ADV|l-acl",
+ "99": "ADV|l-advcl",
+ "100": "ADV|l-advmod",
+ "101": "ADV|l-amod",
+ "102": "ADV|l-cc",
+ "103": "ADV|l-nsubj",
+ "104": "ADV|r-advmod",
+ "105": "ADV|r-ccomp",
+ "106": "ADV|r-conj",
+ "107": "ADV|r-flat:vv",
+ "108": "ADV|r-obj",
+ "109": "ADV|root",
+ "110": "AUX",
+ "111": "AUX.",
+ "112": "AUX|Mood=Des|_",
+ "113": "AUX|Mood=Des|l-aux",
+ "114": "AUX|Mood=Des|l-csubj",
+ "115": "AUX|Mood=Des|l-parataxis",
+ "116": "AUX|Mood=Des|r-ccomp",
+ "117": "AUX|Mood=Des|r-conj",
+ "118": "AUX|Mood=Des|r-flat:vv",
+ "119": "AUX|Mood=Des|root",
+ "120": "AUX|Mood=Nec|_",
+ "121": "AUX|Mood=Nec|l-acl",
+ "122": "AUX|Mood=Nec|l-amod",
+ "123": "AUX|Mood=Nec|l-aux",
+ "124": "AUX|Mood=Nec|r-aux",
+ "125": "AUX|Mood=Nec|root",
+ "126": "AUX|Mood=Pot|_",
+ "127": "AUX|Mood=Pot|l-acl",
+ "128": "AUX|Mood=Pot|l-advcl",
+ "129": "AUX|Mood=Pot|l-amod",
+ "130": "AUX|Mood=Pot|l-aux",
+ "131": "AUX|Mood=Pot|l-csubj",
+ "132": "AUX|Mood=Pot|l-nsubj",
+ "133": "AUX|Mood=Pot|r-ccomp",
+ "134": "AUX|Mood=Pot|r-conj",
+ "135": "AUX|Mood=Pot|r-obj",
+ "136": "AUX|Mood=Pot|r-parataxis",
+ "137": "AUX|Mood=Pot|r-xcomp",
+ "138": "AUX|Mood=Pot|root",
+ "139": "AUX|VerbType=Cop|_",
+ "140": "AUX|VerbType=Cop|l-cop",
+ "141": "AUX|Voice=Pass|_",
+ "142": "AUX|Voice=Pass|l-aux",
+ "143": "AUX|Voice=Pass|r-conj",
+ "144": "AUX|Voice=Pass|root",
+ "145": "B-ADP",
+ "146": "B-ADP.",
+ "147": "B-ADV",
+ "148": "B-ADV.",
+ "149": "B-AUX",
+ "150": "B-AUX.",
+ "151": "B-CCONJ",
+ "152": "B-CCONJ.",
+ "153": "B-INTJ",
+ "154": "B-INTJ.",
+ "155": "B-NOUN",
+ "156": "B-NOUN.",
+ "157": "B-NUM",
+ "158": "B-NUM.",
+ "159": "B-PART",
+ "160": "B-PART.",
+ "161": "B-PRON",
+ "162": "B-PRON.",
+ "163": "B-PROPN",
+ "164": "B-PROPN.",
+ "165": "B-PUNCT",
+ "166": "B-PUNCT.",
+ "167": "B-SCONJ",
+ "168": "B-SCONJ.",
+ "169": "B-SYM",
+ "170": "B-SYM.",
+ "171": "B-VERB",
+ "172": "B-VERB.",
+ "173": "CCONJ",
+ "174": "CCONJ.",
+ "175": "CCONJ|_",
+ "176": "CCONJ|l-advmod",
+ "177": "CCONJ|l-amod",
+ "178": "CCONJ|l-cc",
+ "179": "CCONJ|l-obj",
+ "180": "CCONJ|r-fixed",
+ "181": "CCONJ|r-orphan",
+ "182": "I-ADP",
+ "183": "I-ADP.",
+ "184": "I-ADV",
+ "185": "I-ADV.",
+ "186": "I-AUX",
+ "187": "I-AUX.",
+ "188": "I-CCONJ",
+ "189": "I-CCONJ.",
+ "190": "I-INTJ",
+ "191": "I-INTJ.",
+ "192": "I-NOUN",
+ "193": "I-NOUN.",
+ "194": "I-NUM",
+ "195": "I-NUM.",
+ "196": "I-PART",
+ "197": "I-PART.",
+ "198": "I-PRON",
+ "199": "I-PRON.",
+ "200": "I-PROPN",
+ "201": "I-PROPN.",
+ "202": "I-PUNCT",
+ "203": "I-PUNCT.",
+ "204": "I-SCONJ",
+ "205": "I-SCONJ.",
+ "206": "I-SYM",
+ "207": "I-SYM.",
+ "208": "I-VERB",
+ "209": "I-VERB.",
+ "210": "INTJ",
+ "211": "INTJ.",
+ "212": "INTJ|_",
+ "213": "INTJ|l-advcl",
+ "214": "INTJ|l-csubj",
+ "215": "INTJ|l-discourse",
+ "216": "INTJ|l-discourse:sp",
+ "217": "INTJ|l-dislocated",
+ "218": "INTJ|l-nsubj",
+ "219": "INTJ|l-vocative",
+ "220": "INTJ|r-compound:redup",
+ "221": "INTJ|r-conj",
+ "222": "INTJ|r-discourse:sp",
+ "223": "INTJ|r-dislocated",
+ "224": "INTJ|r-fixed",
+ "225": "INTJ|r-obj",
+ "226": "INTJ|r-parataxis",
+ "227": "INTJ|root",
+ "228": "NOUN",
+ "229": "NOUN.",
+ "230": "NOUN|Case=Loc|_",
+ "231": "NOUN|Case=Loc|l-acl",
+ "232": "NOUN|Case=Loc|l-advcl",
+ "233": "NOUN|Case=Loc|l-amod",
+ "234": "NOUN|Case=Loc|l-clf",
+ "235": "NOUN|Case=Loc|l-compound",
+ "236": "NOUN|Case=Loc|l-csubj",
+ "237": "NOUN|Case=Loc|l-dislocated",
+ "238": "NOUN|Case=Loc|l-nmod",
+ "239": "NOUN|Case=Loc|l-nsubj",
+ "240": "NOUN|Case=Loc|l-nsubj:outer",
+ "241": "NOUN|Case=Loc|l-obj",
+ "242": "NOUN|Case=Loc|l-obl",
+ "243": "NOUN|Case=Loc|l-obl:lmod",
+ "244": "NOUN|Case=Loc|l-obl:tmod",
+ "245": "NOUN|Case=Loc|l-parataxis",
+ "246": "NOUN|Case=Loc|r-ccomp",
+ "247": "NOUN|Case=Loc|r-clf",
+ "248": "NOUN|Case=Loc|r-compound:redup",
+ "249": "NOUN|Case=Loc|r-conj",
+ "250": "NOUN|Case=Loc|r-dislocated",
+ "251": "NOUN|Case=Loc|r-flat",
+ "252": "NOUN|Case=Loc|r-iobj",
+ "253": "NOUN|Case=Loc|r-list",
+ "254": "NOUN|Case=Loc|r-nmod",
+ "255": "NOUN|Case=Loc|r-nsubj",
+ "256": "NOUN|Case=Loc|r-obj",
+ "257": "NOUN|Case=Loc|r-obl",
+ "258": "NOUN|Case=Loc|r-obl:lmod",
+ "259": "NOUN|Case=Loc|r-parataxis",
+ "260": "NOUN|Case=Loc|r-xcomp",
+ "261": "NOUN|Case=Loc|root",
+ "262": "NOUN|Case=Tem|_",
+ "263": "NOUN|Case=Tem|l-acl",
+ "264": "NOUN|Case=Tem|l-advcl",
+ "265": "NOUN|Case=Tem|l-amod",
+ "266": "NOUN|Case=Tem|l-compound",
+ "267": "NOUN|Case=Tem|l-csubj",
+ "268": "NOUN|Case=Tem|l-nmod",
+ "269": "NOUN|Case=Tem|l-nsubj",
+ "270": "NOUN|Case=Tem|l-nsubj:outer",
+ "271": "NOUN|Case=Tem|l-obj",
+ "272": "NOUN|Case=Tem|l-obl:tmod",
+ "273": "NOUN|Case=Tem|r-amod",
+ "274": "NOUN|Case=Tem|r-ccomp",
+ "275": "NOUN|Case=Tem|r-clf",
+ "276": "NOUN|Case=Tem|r-compound:redup",
+ "277": "NOUN|Case=Tem|r-conj",
+ "278": "NOUN|Case=Tem|r-flat",
+ "279": "NOUN|Case=Tem|r-iobj",
+ "280": "NOUN|Case=Tem|r-list",
+ "281": "NOUN|Case=Tem|r-nsubj",
+ "282": "NOUN|Case=Tem|r-obj",
+ "283": "NOUN|Case=Tem|r-obl:tmod",
+ "284": "NOUN|Case=Tem|r-parataxis",
+ "285": "NOUN|Case=Tem|r-xcomp",
+ "286": "NOUN|Case=Tem|root",
+ "287": "NOUN|Degree=Pos|_",
+ "288": "NOUN|Degree=Pos|root",
+ "289": "NOUN|NounType=Clf|_",
+ "290": "NOUN|NounType=Clf|l-clf",
+ "291": "NOUN|NounType=Clf|l-nmod",
+ "292": "NOUN|NounType=Clf|l-nsubj",
+ "293": "NOUN|NounType=Clf|l-obl",
+ "294": "NOUN|NounType=Clf|r-ccomp",
+ "295": "NOUN|NounType=Clf|r-clf",
+ "296": "NOUN|NounType=Clf|r-compound:redup",
+ "297": "NOUN|NounType=Clf|r-conj",
+ "298": "NOUN|NounType=Clf|r-flat",
+ "299": "NOUN|NounType=Clf|r-obj",
+ "300": "NOUN|NounType=Clf|r-parataxis",
+ "301": "NOUN|NounType=Clf|root",
+ "302": "NOUN|_",
+ "303": "NOUN|l-acl",
+ "304": "NOUN|l-advcl",
+ "305": "NOUN|l-amod",
+ "306": "NOUN|l-ccomp",
+ "307": "NOUN|l-clf",
+ "308": "NOUN|l-compound",
+ "309": "NOUN|l-csubj",
+ "310": "NOUN|l-csubj:outer",
+ "311": "NOUN|l-dislocated",
+ "312": "NOUN|l-iobj",
+ "313": "NOUN|l-list",
+ "314": "NOUN|l-nmod",
+ "315": "NOUN|l-nsubj",
+ "316": "NOUN|l-nsubj:outer",
+ "317": "NOUN|l-nsubj:pass",
+ "318": "NOUN|l-obj",
+ "319": "NOUN|l-obl",
+ "320": "NOUN|l-obl:lmod",
+ "321": "NOUN|l-obl:tmod",
+ "322": "NOUN|l-vocative",
+ "323": "NOUN|r-acl",
+ "324": "NOUN|r-advcl",
+ "325": "NOUN|r-amod",
+ "326": "NOUN|r-ccomp",
+ "327": "NOUN|r-clf",
+ "328": "NOUN|r-compound:redup",
+ "329": "NOUN|r-conj",
+ "330": "NOUN|r-csubj",
+ "331": "NOUN|r-dislocated",
+ "332": "NOUN|r-flat",
+ "333": "NOUN|r-flat:foreign",
+ "334": "NOUN|r-iobj",
+ "335": "NOUN|r-list",
+ "336": "NOUN|r-nmod",
+ "337": "NOUN|r-nsubj",
+ "338": "NOUN|r-obj",
+ "339": "NOUN|r-obl",
+ "340": "NOUN|r-obl:lmod",
+ "341": "NOUN|r-parataxis",
+ "342": "NOUN|r-vocative",
+ "343": "NOUN|r-xcomp",
+ "344": "NOUN|root",
+ "345": "NUM",
+ "346": "NUM.",
+ "347": "NUM|NumType=Ord|_",
+ "348": "NUM|NumType=Ord|l-nsubj",
+ "349": "NUM|NumType=Ord|l-nummod",
+ "350": "NUM|NumType=Ord|l-obl",
+ "351": "NUM|NumType=Ord|l-obl:lmod",
+ "352": "NUM|NumType=Ord|l-obl:tmod",
+ "353": "NUM|NumType=Ord|r-conj",
+ "354": "NUM|NumType=Ord|r-flat",
+ "355": "NUM|NumType=Ord|r-obj",
+ "356": "NUM|NumType=Ord|root",
+ "357": "NUM|_",
+ "358": "NUM|l-acl",
+ "359": "NUM|l-advcl",
+ "360": "NUM|l-compound",
+ "361": "NUM|l-csubj",
+ "362": "NUM|l-dislocated",
+ "363": "NUM|l-nsubj",
+ "364": "NUM|l-nsubj:outer",
+ "365": "NUM|l-nummod",
+ "366": "NUM|l-obj",
+ "367": "NUM|l-obl",
+ "368": "NUM|l-obl:lmod",
+ "369": "NUM|l-obl:tmod",
+ "370": "NUM|r-ccomp",
+ "371": "NUM|r-clf",
+ "372": "NUM|r-compound",
+ "373": "NUM|r-compound:redup",
+ "374": "NUM|r-conj",
+ "375": "NUM|r-flat",
+ "376": "NUM|r-iobj",
+ "377": "NUM|r-list",
+ "378": "NUM|r-nummod",
+ "379": "NUM|r-obj",
+ "380": "NUM|r-obl",
+ "381": "NUM|r-obl:tmod",
+ "382": "NUM|r-parataxis",
+ "383": "NUM|r-xcomp",
+ "384": "NUM|root",
+ "385": "PART",
+ "386": "PART.",
+ "387": "PART|_",
+ "388": "PART|l-acl",
+ "389": "PART|l-advcl",
+ "390": "PART|l-advmod",
+ "391": "PART|l-amod",
+ "392": "PART|l-case",
+ "393": "PART|l-cc",
+ "394": "PART|l-csubj",
+ "395": "PART|l-csubj:outer",
+ "396": "PART|l-discourse",
+ "397": "PART|l-discourse:sp",
+ "398": "PART|l-dislocated",
+ "399": "PART|l-mark",
+ "400": "PART|l-nmod",
+ "401": "PART|l-nsubj",
+ "402": "PART|l-nsubj:outer",
+ "403": "PART|l-nsubj:pass",
+ "404": "PART|l-obj",
+ "405": "PART|l-obl",
+ "406": "PART|l-obl:lmod",
+ "407": "PART|r-advmod",
+ "408": "PART|r-case",
+ "409": "PART|r-ccomp",
+ "410": "PART|r-clf",
+ "411": "PART|r-conj",
+ "412": "PART|r-discourse",
+ "413": "PART|r-discourse:sp",
+ "414": "PART|r-dislocated",
+ "415": "PART|r-fixed",
+ "416": "PART|r-flat",
+ "417": "PART|r-iobj",
+ "418": "PART|r-list",
+ "419": "PART|r-mark",
+ "420": "PART|r-nsubj",
+ "421": "PART|r-obj",
+ "422": "PART|r-obl",
+ "423": "PART|r-parataxis",
+ "424": "PART|r-xcomp",
+ "425": "PART|root",
+ "426": "PRON",
+ "427": "PRON.",
+ "428": "PRON|Person=1|PronType=Prs|_",
+ "429": "PRON|Person=1|PronType=Prs|l-acl",
+ "430": "PRON|Person=1|PronType=Prs|l-advcl",
+ "431": "PRON|Person=1|PronType=Prs|l-det",
+ "432": "PRON|Person=1|PronType=Prs|l-iobj",
+ "433": "PRON|Person=1|PronType=Prs|l-nsubj",
+ "434": "PRON|Person=1|PronType=Prs|l-nsubj:outer",
+ "435": "PRON|Person=1|PronType=Prs|l-obj",
+ "436": "PRON|Person=1|PronType=Prs|l-obl",
+ "437": "PRON|Person=1|PronType=Prs|l-vocative",
+ "438": "PRON|Person=1|PronType=Prs|r-ccomp",
+ "439": "PRON|Person=1|PronType=Prs|r-conj",
+ "440": "PRON|Person=1|PronType=Prs|r-iobj",
+ "441": "PRON|Person=1|PronType=Prs|r-nsubj",
+ "442": "PRON|Person=1|PronType=Prs|r-obj",
+ "443": "PRON|Person=1|PronType=Prs|r-obl",
+ "444": "PRON|Person=1|PronType=Prs|r-obl:lmod",
+ "445": "PRON|Person=1|PronType=Prs|root",
+ "446": "PRON|Person=2|PronType=Prs|_",
+ "447": "PRON|Person=2|PronType=Prs|l-advcl",
+ "448": "PRON|Person=2|PronType=Prs|l-amod",
+ "449": "PRON|Person=2|PronType=Prs|l-det",
+ "450": "PRON|Person=2|PronType=Prs|l-nmod",
+ "451": "PRON|Person=2|PronType=Prs|l-nsubj",
+ "452": "PRON|Person=2|PronType=Prs|l-nsubj:outer",
+ "453": "PRON|Person=2|PronType=Prs|l-obj",
+ "454": "PRON|Person=2|PronType=Prs|l-obl",
+ "455": "PRON|Person=2|PronType=Prs|l-vocative",
+ "456": "PRON|Person=2|PronType=Prs|r-conj",
+ "457": "PRON|Person=2|PronType=Prs|r-flat",
+ "458": "PRON|Person=2|PronType=Prs|r-iobj",
+ "459": "PRON|Person=2|PronType=Prs|r-obj",
+ "460": "PRON|Person=2|PronType=Prs|r-obl",
+ "461": "PRON|Person=2|PronType=Prs|root",
+ "462": "PRON|Person=3|PronType=Prs|_",
+ "463": "PRON|Person=3|PronType=Prs|l-advcl",
+ "464": "PRON|Person=3|PronType=Prs|l-amod",
+ "465": "PRON|Person=3|PronType=Prs|l-det",
+ "466": "PRON|Person=3|PronType=Prs|l-dislocated",
+ "467": "PRON|Person=3|PronType=Prs|l-expl",
+ "468": "PRON|Person=3|PronType=Prs|l-iobj",
+ "469": "PRON|Person=3|PronType=Prs|l-nsubj",
+ "470": "PRON|Person=3|PronType=Prs|l-nsubj:outer",
+ "471": "PRON|Person=3|PronType=Prs|l-nsubj:pass",
+ "472": "PRON|Person=3|PronType=Prs|l-obj",
+ "473": "PRON|Person=3|PronType=Prs|l-obl",
+ "474": "PRON|Person=3|PronType=Prs|r-ccomp",
+ "475": "PRON|Person=3|PronType=Prs|r-conj",
+ "476": "PRON|Person=3|PronType=Prs|r-expl",
+ "477": "PRON|Person=3|PronType=Prs|r-iobj",
+ "478": "PRON|Person=3|PronType=Prs|r-nsubj",
+ "479": "PRON|Person=3|PronType=Prs|r-obj",
+ "480": "PRON|Person=3|PronType=Prs|r-obl",
+ "481": "PRON|Person=3|PronType=Prs|root",
+ "482": "PRON|PronType=Dem|_",
+ "483": "PRON|PronType=Dem|l-acl",
+ "484": "PRON|PronType=Dem|l-advcl",
+ "485": "PRON|PronType=Dem|l-amod",
+ "486": "PRON|PronType=Dem|l-compound",
+ "487": "PRON|PronType=Dem|l-det",
+ "488": "PRON|PronType=Dem|l-dislocated",
+ "489": "PRON|PronType=Dem|l-expl",
+ "490": "PRON|PronType=Dem|l-nsubj",
+ "491": "PRON|PronType=Dem|l-nsubj:outer",
+ "492": "PRON|PronType=Dem|l-obj",
+ "493": "PRON|PronType=Dem|l-obl",
+ "494": "PRON|PronType=Dem|l-obl:lmod",
+ "495": "PRON|PronType=Dem|r-conj",
+ "496": "PRON|PronType=Dem|r-det",
+ "497": "PRON|PronType=Dem|r-expl",
+ "498": "PRON|PronType=Dem|r-flat",
+ "499": "PRON|PronType=Dem|r-iobj",
+ "500": "PRON|PronType=Dem|r-obj",
+ "501": "PRON|PronType=Dem|r-obl",
+ "502": "PRON|PronType=Dem|r-obl:lmod",
+ "503": "PRON|PronType=Dem|root",
+ "504": "PRON|PronType=Int|_",
+ "505": "PRON|PronType=Int|l-advcl",
+ "506": "PRON|PronType=Int|l-amod",
+ "507": "PRON|PronType=Int|l-det",
+ "508": "PRON|PronType=Int|l-dislocated",
+ "509": "PRON|PronType=Int|l-nsubj",
+ "510": "PRON|PronType=Int|l-nsubj:outer",
+ "511": "PRON|PronType=Int|l-obj",
+ "512": "PRON|PronType=Int|l-obl",
+ "513": "PRON|PronType=Int|l-vocative",
+ "514": "PRON|PronType=Int|r-ccomp",
+ "515": "PRON|PronType=Int|r-conj",
+ "516": "PRON|PronType=Int|r-flat",
+ "517": "PRON|PronType=Int|r-obj",
+ "518": "PRON|PronType=Int|r-parataxis",
+ "519": "PRON|PronType=Int|r-xcomp",
+ "520": "PRON|PronType=Int|root",
+ "521": "PRON|PronType=Prs|Reflex=Yes|_",
+ "522": "PRON|PronType=Prs|Reflex=Yes|l-acl",
+ "523": "PRON|PronType=Prs|Reflex=Yes|l-det",
+ "524": "PRON|PronType=Prs|Reflex=Yes|l-nsubj",
+ "525": "PRON|PronType=Prs|Reflex=Yes|l-obj",
+ "526": "PRON|PronType=Prs|Reflex=Yes|l-obl",
+ "527": "PRON|PronType=Prs|Reflex=Yes|r-dislocated",
+ "528": "PRON|PronType=Prs|Reflex=Yes|r-obj",
+ "529": "PRON|PronType=Prs|Reflex=Yes|r-obl",
+ "530": "PRON|PronType=Prs|Reflex=Yes|root",
+ "531": "PRON|PronType=Prs|_",
+ "532": "PRON|PronType=Prs|l-det",
+ "533": "PRON|PronType=Prs|l-nsubj",
+ "534": "PRON|PronType=Prs|l-nsubj:outer",
+ "535": "PRON|PronType=Prs|l-obj",
+ "536": "PRON|PronType=Prs|r-conj",
+ "537": "PRON|PronType=Prs|r-iobj",
+ "538": "PRON|PronType=Prs|r-obj",
+ "539": "PROPN",
+ "540": "PROPN.",
+ "541": "PROPN|Case=Loc|NameType=Geo|_",
+ "542": "PROPN|Case=Loc|NameType=Geo|l-acl",
+ "543": "PROPN|Case=Loc|NameType=Geo|l-advcl",
+ "544": "PROPN|Case=Loc|NameType=Geo|l-amod",
+ "545": "PROPN|Case=Loc|NameType=Geo|l-compound",
+ "546": "PROPN|Case=Loc|NameType=Geo|l-csubj",
+ "547": "PROPN|Case=Loc|NameType=Geo|l-dislocated",
+ "548": "PROPN|Case=Loc|NameType=Geo|l-nmod",
+ "549": "PROPN|Case=Loc|NameType=Geo|l-nsubj",
+ "550": "PROPN|Case=Loc|NameType=Geo|l-nsubj:outer",
+ "551": "PROPN|Case=Loc|NameType=Geo|l-obl",
+ "552": "PROPN|Case=Loc|NameType=Geo|l-obl:lmod",
+ "553": "PROPN|Case=Loc|NameType=Geo|r-conj",
+ "554": "PROPN|Case=Loc|NameType=Geo|r-flat",
+ "555": "PROPN|Case=Loc|NameType=Geo|r-iobj",
+ "556": "PROPN|Case=Loc|NameType=Geo|r-obj",
+ "557": "PROPN|Case=Loc|NameType=Geo|r-obl",
+ "558": "PROPN|Case=Loc|NameType=Geo|r-obl:lmod",
+ "559": "PROPN|Case=Loc|NameType=Geo|r-parataxis",
+ "560": "PROPN|Case=Loc|NameType=Geo|r-xcomp",
+ "561": "PROPN|Case=Loc|NameType=Geo|root",
+ "562": "PROPN|Case=Loc|NameType=Nat|_",
+ "563": "PROPN|Case=Loc|NameType=Nat|l-acl",
+ "564": "PROPN|Case=Loc|NameType=Nat|l-advcl",
+ "565": "PROPN|Case=Loc|NameType=Nat|l-amod",
+ "566": "PROPN|Case=Loc|NameType=Nat|l-clf",
+ "567": "PROPN|Case=Loc|NameType=Nat|l-compound",
+ "568": "PROPN|Case=Loc|NameType=Nat|l-nmod",
+ "569": "PROPN|Case=Loc|NameType=Nat|l-nsubj",
+ "570": "PROPN|Case=Loc|NameType=Nat|l-nsubj:outer",
+ "571": "PROPN|Case=Loc|NameType=Nat|l-nsubj:pass",
+ "572": "PROPN|Case=Loc|NameType=Nat|l-obj",
+ "573": "PROPN|Case=Loc|NameType=Nat|l-obl",
+ "574": "PROPN|Case=Loc|NameType=Nat|l-obl:lmod",
+ "575": "PROPN|Case=Loc|NameType=Nat|r-ccomp",
+ "576": "PROPN|Case=Loc|NameType=Nat|r-conj",
+ "577": "PROPN|Case=Loc|NameType=Nat|r-flat",
+ "578": "PROPN|Case=Loc|NameType=Nat|r-iobj",
+ "579": "PROPN|Case=Loc|NameType=Nat|r-nmod",
+ "580": "PROPN|Case=Loc|NameType=Nat|r-obj",
+ "581": "PROPN|Case=Loc|NameType=Nat|r-obl",
+ "582": "PROPN|Case=Loc|NameType=Nat|r-obl:lmod",
+ "583": "PROPN|Case=Loc|NameType=Nat|r-parataxis",
+ "584": "PROPN|Case=Loc|NameType=Nat|r-xcomp",
+ "585": "PROPN|Case=Loc|NameType=Nat|root",
+ "586": "PROPN|NameType=Giv|_",
+ "587": "PROPN|NameType=Giv|l-acl",
+ "588": "PROPN|NameType=Giv|l-advcl",
+ "589": "PROPN|NameType=Giv|l-amod",
+ "590": "PROPN|NameType=Giv|l-compound",
+ "591": "PROPN|NameType=Giv|l-dislocated",
+ "592": "PROPN|NameType=Giv|l-nmod",
+ "593": "PROPN|NameType=Giv|l-nsubj",
+ "594": "PROPN|NameType=Giv|l-nsubj:outer",
+ "595": "PROPN|NameType=Giv|l-nsubj:pass",
+ "596": "PROPN|NameType=Giv|l-obj",
+ "597": "PROPN|NameType=Giv|l-obl",
+ "598": "PROPN|NameType=Giv|l-obl:lmod",
+ "599": "PROPN|NameType=Giv|l-parataxis",
+ "600": "PROPN|NameType=Giv|l-vocative",
+ "601": "PROPN|NameType=Giv|r-appos",
+ "602": "PROPN|NameType=Giv|r-ccomp",
+ "603": "PROPN|NameType=Giv|r-conj",
+ "604": "PROPN|NameType=Giv|r-dislocated",
+ "605": "PROPN|NameType=Giv|r-flat",
+ "606": "PROPN|NameType=Giv|r-iobj",
+ "607": "PROPN|NameType=Giv|r-list",
+ "608": "PROPN|NameType=Giv|r-nmod",
+ "609": "PROPN|NameType=Giv|r-obj",
+ "610": "PROPN|NameType=Giv|r-obl",
+ "611": "PROPN|NameType=Giv|r-obl:lmod",
+ "612": "PROPN|NameType=Giv|r-parataxis",
+ "613": "PROPN|NameType=Giv|r-xcomp",
+ "614": "PROPN|NameType=Giv|root",
+ "615": "PROPN|NameType=Prs|_",
+ "616": "PROPN|NameType=Prs|l-acl",
+ "617": "PROPN|NameType=Prs|l-advcl",
+ "618": "PROPN|NameType=Prs|l-amod",
+ "619": "PROPN|NameType=Prs|l-compound",
+ "620": "PROPN|NameType=Prs|l-dislocated",
+ "621": "PROPN|NameType=Prs|l-nmod",
+ "622": "PROPN|NameType=Prs|l-nsubj",
+ "623": "PROPN|NameType=Prs|l-nsubj:outer",
+ "624": "PROPN|NameType=Prs|l-obj",
+ "625": "PROPN|NameType=Prs|l-obl",
+ "626": "PROPN|NameType=Prs|r-conj",
+ "627": "PROPN|NameType=Prs|r-dislocated",
+ "628": "PROPN|NameType=Prs|r-flat",
+ "629": "PROPN|NameType=Prs|r-iobj",
+ "630": "PROPN|NameType=Prs|r-obj",
+ "631": "PROPN|NameType=Prs|r-obl",
+ "632": "PROPN|NameType=Prs|r-parataxis",
+ "633": "PROPN|NameType=Prs|root",
+ "634": "PROPN|NameType=Sur|_",
+ "635": "PROPN|NameType=Sur|l-acl",
+ "636": "PROPN|NameType=Sur|l-advcl",
+ "637": "PROPN|NameType=Sur|l-amod",
+ "638": "PROPN|NameType=Sur|l-compound",
+ "639": "PROPN|NameType=Sur|l-csubj",
+ "640": "PROPN|NameType=Sur|l-dislocated",
+ "641": "PROPN|NameType=Sur|l-nmod",
+ "642": "PROPN|NameType=Sur|l-nsubj",
+ "643": "PROPN|NameType=Sur|l-nsubj:outer",
+ "644": "PROPN|NameType=Sur|l-nsubj:pass",
+ "645": "PROPN|NameType=Sur|l-obl",
+ "646": "PROPN|NameType=Sur|l-obl:lmod",
+ "647": "PROPN|NameType=Sur|l-vocative",
+ "648": "PROPN|NameType=Sur|r-ccomp",
+ "649": "PROPN|NameType=Sur|r-conj",
+ "650": "PROPN|NameType=Sur|r-dislocated",
+ "651": "PROPN|NameType=Sur|r-flat",
+ "652": "PROPN|NameType=Sur|r-iobj",
+ "653": "PROPN|NameType=Sur|r-list",
+ "654": "PROPN|NameType=Sur|r-nmod",
+ "655": "PROPN|NameType=Sur|r-nsubj",
+ "656": "PROPN|NameType=Sur|r-obj",
+ "657": "PROPN|NameType=Sur|r-obl",
+ "658": "PROPN|NameType=Sur|r-obl:lmod",
+ "659": "PROPN|NameType=Sur|r-parataxis",
+ "660": "PROPN|NameType=Sur|r-xcomp",
+ "661": "PROPN|NameType=Sur|root",
+ "662": "PROPN|_",
+ "663": "PROPN|l-nmod",
+ "664": "PUNCT",
+ "665": "PUNCT.",
+ "666": "PUNCT|_",
+ "667": "PUNCT|root",
+ "668": "SCONJ",
+ "669": "SCONJ.",
+ "670": "SCONJ|_",
+ "671": "SCONJ|l-case",
+ "672": "SCONJ|l-cc",
+ "673": "SCONJ|l-mark",
+ "674": "SCONJ|l-nsubj",
+ "675": "SCONJ|l-obl",
+ "676": "SCONJ|r-case",
+ "677": "SCONJ|r-iobj",
+ "678": "SCONJ|r-mark",
+ "679": "SCONJ|r-nsubj",
+ "680": "SCONJ|r-nsubj:pass",
+ "681": "SCONJ|r-obj",
+ "682": "SCONJ|root",
+ "683": "SYM",
+ "684": "SYM.",
+ "685": "SYM|_",
+ "686": "SYM|l-nmod",
+ "687": "SYM|l-nsubj",
+ "688": "SYM|r-conj",
+ "689": "SYM|r-nmod",
+ "690": "SYM|r-xcomp",
+ "691": "SYM|root",
+ "692": "VERB",
+ "693": "VERB.",
+ "694": "VERB|Degree=Equ|VerbForm=Part|_",
+ "695": "VERB|Degree=Equ|VerbForm=Part|l-amod",
+ "696": "VERB|Degree=Equ|_",
+ "697": "VERB|Degree=Equ|l-acl",
+ "698": "VERB|Degree=Equ|l-advcl",
+ "699": "VERB|Degree=Equ|l-ccomp",
+ "700": "VERB|Degree=Equ|l-csubj",
+ "701": "VERB|Degree=Equ|l-nsubj",
+ "702": "VERB|Degree=Equ|l-obj",
+ "703": "VERB|Degree=Equ|r-ccomp",
+ "704": "VERB|Degree=Equ|r-compound:redup",
+ "705": "VERB|Degree=Equ|r-conj",
+ "706": "VERB|Degree=Equ|r-obj",
+ "707": "VERB|Degree=Equ|r-parataxis",
+ "708": "VERB|Degree=Equ|r-xcomp",
+ "709": "VERB|Degree=Equ|root",
+ "710": "VERB|Degree=Pos|VerbForm=Part|_",
+ "711": "VERB|Degree=Pos|VerbForm=Part|l-amod",
+ "712": "VERB|Degree=Pos|VerbForm=Part|r-amod",
+ "713": "VERB|Degree=Pos|_",
+ "714": "VERB|Degree=Pos|l-acl",
+ "715": "VERB|Degree=Pos|l-advcl",
+ "716": "VERB|Degree=Pos|l-ccomp",
+ "717": "VERB|Degree=Pos|l-csubj",
+ "718": "VERB|Degree=Pos|l-csubj:outer",
+ "719": "VERB|Degree=Pos|l-dislocated",
+ "720": "VERB|Degree=Pos|l-nsubj",
+ "721": "VERB|Degree=Pos|l-nsubj:outer",
+ "722": "VERB|Degree=Pos|l-obj",
+ "723": "VERB|Degree=Pos|l-obl",
+ "724": "VERB|Degree=Pos|l-vocative",
+ "725": "VERB|Degree=Pos|r-advcl",
+ "726": "VERB|Degree=Pos|r-ccomp",
+ "727": "VERB|Degree=Pos|r-compound:redup",
+ "728": "VERB|Degree=Pos|r-conj",
+ "729": "VERB|Degree=Pos|r-dislocated",
+ "730": "VERB|Degree=Pos|r-fixed",
+ "731": "VERB|Degree=Pos|r-flat:vv",
+ "732": "VERB|Degree=Pos|r-iobj",
+ "733": "VERB|Degree=Pos|r-obj",
+ "734": "VERB|Degree=Pos|r-obl",
+ "735": "VERB|Degree=Pos|r-parataxis",
+ "736": "VERB|Degree=Pos|r-xcomp",
+ "737": "VERB|Degree=Pos|root",
+ "738": "VERB|Polarity=Neg|VerbForm=Part|_",
+ "739": "VERB|Polarity=Neg|VerbForm=Part|l-amod",
+ "740": "VERB|Polarity=Neg|_",
+ "741": "VERB|Polarity=Neg|l-acl",
+ "742": "VERB|Polarity=Neg|l-advcl",
+ "743": "VERB|Polarity=Neg|l-ccomp",
+ "744": "VERB|Polarity=Neg|l-csubj",
+ "745": "VERB|Polarity=Neg|l-csubj:outer",
+ "746": "VERB|Polarity=Neg|l-nsubj",
+ "747": "VERB|Polarity=Neg|l-obl",
+ "748": "VERB|Polarity=Neg|r-advcl",
+ "749": "VERB|Polarity=Neg|r-ccomp",
+ "750": "VERB|Polarity=Neg|r-conj",
784
+ "751": "VERB|Polarity=Neg|r-flat:vv",
785
+ "752": "VERB|Polarity=Neg|r-obj",
786
+ "753": "VERB|Polarity=Neg|r-obl",
787
+ "754": "VERB|Polarity=Neg|r-parataxis",
788
+ "755": "VERB|Polarity=Neg|r-xcomp",
789
+ "756": "VERB|Polarity=Neg|root",
790
+ "757": "VERB|VerbForm=Part|_",
791
+ "758": "VERB|VerbForm=Part|l-amod",
792
+ "759": "VERB|VerbForm=Part|r-amod",
793
+ "760": "VERB|_",
794
+ "761": "VERB|l-acl",
795
+ "762": "VERB|l-advcl",
796
+ "763": "VERB|l-ccomp",
797
+ "764": "VERB|l-csubj",
798
+ "765": "VERB|l-csubj:outer",
799
+ "766": "VERB|l-csubj:pass",
800
+ "767": "VERB|l-dislocated",
801
+ "768": "VERB|l-nsubj",
802
+ "769": "VERB|l-nsubj:outer",
803
+ "770": "VERB|l-obj",
804
+ "771": "VERB|l-obl",
805
+ "772": "VERB|l-obl:lmod",
806
+ "773": "VERB|l-parataxis",
807
+ "774": "VERB|r-acl",
808
+ "775": "VERB|r-advcl",
809
+ "776": "VERB|r-ccomp",
810
+ "777": "VERB|r-compound:redup",
811
+ "778": "VERB|r-conj",
812
+ "779": "VERB|r-dislocated",
813
+ "780": "VERB|r-fixed",
814
+ "781": "VERB|r-flat:vv",
815
+ "782": "VERB|r-iobj",
816
+ "783": "VERB|r-list",
817
+ "784": "VERB|r-obj",
818
+ "785": "VERB|r-obl",
819
+ "786": "VERB|r-obl:lmod",
820
+ "787": "VERB|r-parataxis",
821
+ "788": "VERB|r-vocative",
822
+ "789": "VERB|r-xcomp",
823
+ "790": "VERB|root"
824
+ },
825
+ "initializer_cutoff_factor": 2.0,
826
+ "initializer_range": 0.02,
827
+ "intermediate_size": 2624,
828
+ "label2id": {
829
+ "ADP": 0,
830
+ "ADP.": 1,
831
+ "ADP|Degree=Equ|_": 2,
832
+ "ADP|Degree=Equ|l-cc": 3,
833
+ "ADP|_": 4,
834
+ "ADP|l-acl": 5,
835
+ "ADP|l-advcl": 6,
836
+ "ADP|l-amod": 7,
837
+ "ADP|l-case": 8,
838
+ "ADP|l-cc": 9,
839
+ "ADP|l-mark": 10,
840
+ "ADP|l-nsubj": 11,
841
+ "ADP|l-obl": 12,
842
+ "ADP|r-case": 13,
843
+ "ADP|r-conj": 14,
844
+ "ADP|r-fixed": 15,
845
+ "ADP|r-mark": 16,
846
+ "ADP|r-obj": 17,
847
+ "ADP|root": 18,
848
+ "ADV": 19,
849
+ "ADV.": 20,
850
+ "ADV|AdvType=Cau|_": 21,
851
+ "ADV|AdvType=Cau|l-advmod": 22,
852
+ "ADV|AdvType=Cau|l-amod": 23,
853
+ "ADV|AdvType=Cau|l-nsubj": 24,
854
+ "ADV|AdvType=Cau|l-obj": 25,
855
+ "ADV|AdvType=Deg|Degree=Cmp|_": 26,
856
+ "ADV|AdvType=Deg|Degree=Cmp|l-advmod": 27,
857
+ "ADV|AdvType=Deg|Degree=Cmp|l-amod": 28,
858
+ "ADV|AdvType=Deg|Degree=Cmp|r-conj": 29,
859
+ "ADV|AdvType=Deg|Degree=Cmp|r-obj": 30,
860
+ "ADV|AdvType=Deg|Degree=Pos|_": 31,
861
+ "ADV|AdvType=Deg|Degree=Pos|l-advmod": 32,
862
+ "ADV|AdvType=Deg|Degree=Pos|l-amod": 33,
863
+ "ADV|AdvType=Deg|Degree=Pos|r-ccomp": 34,
864
+ "ADV|AdvType=Deg|Degree=Pos|r-conj": 35,
865
+ "ADV|AdvType=Deg|Degree=Pos|r-flat:vv": 36,
866
+ "ADV|AdvType=Deg|Degree=Pos|r-parataxis": 37,
867
+ "ADV|AdvType=Deg|Degree=Pos|root": 38,
868
+ "ADV|AdvType=Deg|Degree=Sup|_": 39,
869
+ "ADV|AdvType=Deg|Degree=Sup|l-advmod": 40,
870
+ "ADV|AdvType=Deg|Degree=Sup|l-amod": 41,
871
+ "ADV|AdvType=Deg|Degree=Sup|l-nsubj": 42,
872
+ "ADV|AdvType=Deg|Degree=Sup|r-conj": 43,
873
+ "ADV|AdvType=Deg|Degree=Sup|r-parataxis": 44,
874
+ "ADV|AdvType=Deg|Degree=Sup|root": 45,
875
+ "ADV|AdvType=Tim|Aspect=Perf|_": 46,
876
+ "ADV|AdvType=Tim|Aspect=Perf|l-advmod": 47,
877
+ "ADV|AdvType=Tim|Aspect=Perf|l-amod": 48,
878
+ "ADV|AdvType=Tim|Aspect=Perf|l-obl:lmod": 49,
879
+ "ADV|AdvType=Tim|Aspect=Perf|r-parataxis": 50,
880
+ "ADV|AdvType=Tim|Aspect=Perf|root": 51,
881
+ "ADV|AdvType=Tim|Tense=Fut|_": 52,
882
+ "ADV|AdvType=Tim|Tense=Fut|l-advmod": 53,
883
+ "ADV|AdvType=Tim|Tense=Fut|l-amod": 54,
884
+ "ADV|AdvType=Tim|Tense=Fut|l-nsubj": 55,
885
+ "ADV|AdvType=Tim|Tense=Fut|l-nsubj:outer": 56,
886
+ "ADV|AdvType=Tim|Tense=Fut|root": 57,
887
+ "ADV|AdvType=Tim|Tense=Past|_": 58,
888
+ "ADV|AdvType=Tim|Tense=Past|l-advmod": 59,
889
+ "ADV|AdvType=Tim|Tense=Past|l-amod": 60,
890
+ "ADV|AdvType=Tim|Tense=Pres|_": 61,
891
+ "ADV|AdvType=Tim|Tense=Pres|l-advmod": 62,
892
+ "ADV|AdvType=Tim|Tense=Pres|l-amod": 63,
893
+ "ADV|AdvType=Tim|Tense=Pres|root": 64,
894
+ "ADV|AdvType=Tim|_": 65,
895
+ "ADV|AdvType=Tim|l-advcl": 66,
896
+ "ADV|AdvType=Tim|l-advmod": 67,
897
+ "ADV|AdvType=Tim|l-amod": 68,
898
+ "ADV|AdvType=Tim|l-nsubj": 69,
899
+ "ADV|AdvType=Tim|r-advmod": 70,
900
+ "ADV|AdvType=Tim|r-ccomp": 71,
901
+ "ADV|AdvType=Tim|r-compound:redup": 72,
902
+ "ADV|AdvType=Tim|r-conj": 73,
903
+ "ADV|AdvType=Tim|r-flat:vv": 74,
904
+ "ADV|AdvType=Tim|r-parataxis": 75,
905
+ "ADV|AdvType=Tim|root": 76,
906
+ "ADV|Degree=Equ|VerbForm=Conv|_": 77,
907
+ "ADV|Degree=Equ|VerbForm=Conv|l-advmod": 78,
908
+ "ADV|Degree=Pos|VerbForm=Conv|_": 79,
909
+ "ADV|Degree=Pos|VerbForm=Conv|l-advmod": 80,
910
+ "ADV|Degree=Pos|VerbForm=Conv|r-advmod": 81,
911
+ "ADV|Polarity=Neg|VerbForm=Conv|_": 82,
912
+ "ADV|Polarity=Neg|VerbForm=Conv|l-advmod": 83,
913
+ "ADV|Polarity=Neg|_": 84,
914
+ "ADV|Polarity=Neg|l-advmod": 85,
915
+ "ADV|Polarity=Neg|l-amod": 86,
916
+ "ADV|Polarity=Neg|l-nsubj": 87,
917
+ "ADV|Polarity=Neg|l-parataxis": 88,
918
+ "ADV|Polarity=Neg|r-advmod": 89,
919
+ "ADV|Polarity=Neg|r-conj": 90,
920
+ "ADV|Polarity=Neg|r-obj": 91,
921
+ "ADV|Polarity=Neg|r-parataxis": 92,
922
+ "ADV|Polarity=Neg|root": 93,
923
+ "ADV|VerbForm=Conv|_": 94,
924
+ "ADV|VerbForm=Conv|l-advmod": 95,
925
+ "ADV|VerbForm=Conv|r-advmod": 96,
926
+ "ADV|_": 97,
927
+ "ADV|l-acl": 98,
928
+ "ADV|l-advcl": 99,
929
+ "ADV|l-advmod": 100,
930
+ "ADV|l-amod": 101,
931
+ "ADV|l-cc": 102,
932
+ "ADV|l-nsubj": 103,
933
+ "ADV|r-advmod": 104,
934
+ "ADV|r-ccomp": 105,
935
+ "ADV|r-conj": 106,
936
+ "ADV|r-flat:vv": 107,
937
+ "ADV|r-obj": 108,
938
+ "ADV|root": 109,
939
+ "AUX": 110,
940
+ "AUX.": 111,
941
+ "AUX|Mood=Des|_": 112,
942
+ "AUX|Mood=Des|l-aux": 113,
943
+ "AUX|Mood=Des|l-csubj": 114,
944
+ "AUX|Mood=Des|l-parataxis": 115,
945
+ "AUX|Mood=Des|r-ccomp": 116,
946
+ "AUX|Mood=Des|r-conj": 117,
947
+ "AUX|Mood=Des|r-flat:vv": 118,
948
+ "AUX|Mood=Des|root": 119,
949
+ "AUX|Mood=Nec|_": 120,
950
+ "AUX|Mood=Nec|l-acl": 121,
951
+ "AUX|Mood=Nec|l-amod": 122,
952
+ "AUX|Mood=Nec|l-aux": 123,
953
+ "AUX|Mood=Nec|r-aux": 124,
954
+ "AUX|Mood=Nec|root": 125,
955
+ "AUX|Mood=Pot|_": 126,
956
+ "AUX|Mood=Pot|l-acl": 127,
957
+ "AUX|Mood=Pot|l-advcl": 128,
958
+ "AUX|Mood=Pot|l-amod": 129,
959
+ "AUX|Mood=Pot|l-aux": 130,
960
+ "AUX|Mood=Pot|l-csubj": 131,
961
+ "AUX|Mood=Pot|l-nsubj": 132,
962
+ "AUX|Mood=Pot|r-ccomp": 133,
963
+ "AUX|Mood=Pot|r-conj": 134,
964
+ "AUX|Mood=Pot|r-obj": 135,
965
+ "AUX|Mood=Pot|r-parataxis": 136,
966
+ "AUX|Mood=Pot|r-xcomp": 137,
967
+ "AUX|Mood=Pot|root": 138,
968
+ "AUX|VerbType=Cop|_": 139,
969
+ "AUX|VerbType=Cop|l-cop": 140,
970
+ "AUX|Voice=Pass|_": 141,
971
+ "AUX|Voice=Pass|l-aux": 142,
972
+ "AUX|Voice=Pass|r-conj": 143,
973
+ "AUX|Voice=Pass|root": 144,
974
+ "B-ADP": 145,
975
+ "B-ADP.": 146,
976
+ "B-ADV": 147,
977
+ "B-ADV.": 148,
978
+ "B-AUX": 149,
979
+ "B-AUX.": 150,
980
+ "B-CCONJ": 151,
981
+ "B-CCONJ.": 152,
982
+ "B-INTJ": 153,
983
+ "B-INTJ.": 154,
984
+ "B-NOUN": 155,
985
+ "B-NOUN.": 156,
986
+ "B-NUM": 157,
987
+ "B-NUM.": 158,
988
+ "B-PART": 159,
989
+ "B-PART.": 160,
990
+ "B-PRON": 161,
991
+ "B-PRON.": 162,
992
+ "B-PROPN": 163,
993
+ "B-PROPN.": 164,
994
+ "B-PUNCT": 165,
995
+ "B-PUNCT.": 166,
996
+ "B-SCONJ": 167,
997
+ "B-SCONJ.": 168,
998
+ "B-SYM": 169,
999
+ "B-SYM.": 170,
1000
+ "B-VERB": 171,
1001
+ "B-VERB.": 172,
1002
+ "CCONJ": 173,
1003
+ "CCONJ.": 174,
1004
+ "CCONJ|_": 175,
1005
+ "CCONJ|l-advmod": 176,
1006
+ "CCONJ|l-amod": 177,
1007
+ "CCONJ|l-cc": 178,
1008
+ "CCONJ|l-obj": 179,
1009
+ "CCONJ|r-fixed": 180,
1010
+ "CCONJ|r-orphan": 181,
1011
+ "I-ADP": 182,
1012
+ "I-ADP.": 183,
1013
+ "I-ADV": 184,
1014
+ "I-ADV.": 185,
1015
+ "I-AUX": 186,
1016
+ "I-AUX.": 187,
1017
+ "I-CCONJ": 188,
1018
+ "I-CCONJ.": 189,
1019
+ "I-INTJ": 190,
1020
+ "I-INTJ.": 191,
1021
+ "I-NOUN": 192,
1022
+ "I-NOUN.": 193,
1023
+ "I-NUM": 194,
1024
+ "I-NUM.": 195,
1025
+ "I-PART": 196,
1026
+ "I-PART.": 197,
1027
+ "I-PRON": 198,
1028
+ "I-PRON.": 199,
1029
+ "I-PROPN": 200,
1030
+ "I-PROPN.": 201,
1031
+ "I-PUNCT": 202,
1032
+ "I-PUNCT.": 203,
1033
+ "I-SCONJ": 204,
1034
+ "I-SCONJ.": 205,
1035
+ "I-SYM": 206,
1036
+ "I-SYM.": 207,
1037
+ "I-VERB": 208,
1038
+ "I-VERB.": 209,
1039
+ "INTJ": 210,
1040
+ "INTJ.": 211,
1041
+ "INTJ|_": 212,
1042
+ "INTJ|l-advcl": 213,
1043
+ "INTJ|l-csubj": 214,
1044
+ "INTJ|l-discourse": 215,
1045
+ "INTJ|l-discourse:sp": 216,
1046
+ "INTJ|l-dislocated": 217,
1047
+ "INTJ|l-nsubj": 218,
1048
+ "INTJ|l-vocative": 219,
1049
+ "INTJ|r-compound:redup": 220,
1050
+ "INTJ|r-conj": 221,
1051
+ "INTJ|r-discourse:sp": 222,
1052
+ "INTJ|r-dislocated": 223,
1053
+ "INTJ|r-fixed": 224,
1054
+ "INTJ|r-obj": 225,
1055
+ "INTJ|r-parataxis": 226,
1056
+ "INTJ|root": 227,
1057
+ "NOUN": 228,
1058
+ "NOUN.": 229,
1059
+ "NOUN|Case=Loc|_": 230,
1060
+ "NOUN|Case=Loc|l-acl": 231,
1061
+ "NOUN|Case=Loc|l-advcl": 232,
1062
+ "NOUN|Case=Loc|l-amod": 233,
1063
+ "NOUN|Case=Loc|l-clf": 234,
1064
+ "NOUN|Case=Loc|l-compound": 235,
1065
+ "NOUN|Case=Loc|l-csubj": 236,
1066
+ "NOUN|Case=Loc|l-dislocated": 237,
1067
+ "NOUN|Case=Loc|l-nmod": 238,
1068
+ "NOUN|Case=Loc|l-nsubj": 239,
1069
+ "NOUN|Case=Loc|l-nsubj:outer": 240,
1070
+ "NOUN|Case=Loc|l-obj": 241,
1071
+ "NOUN|Case=Loc|l-obl": 242,
1072
+ "NOUN|Case=Loc|l-obl:lmod": 243,
1073
+ "NOUN|Case=Loc|l-obl:tmod": 244,
1074
+ "NOUN|Case=Loc|l-parataxis": 245,
1075
+ "NOUN|Case=Loc|r-ccomp": 246,
1076
+ "NOUN|Case=Loc|r-clf": 247,
1077
+ "NOUN|Case=Loc|r-compound:redup": 248,
1078
+ "NOUN|Case=Loc|r-conj": 249,
1079
+ "NOUN|Case=Loc|r-dislocated": 250,
1080
+ "NOUN|Case=Loc|r-flat": 251,
1081
+ "NOUN|Case=Loc|r-iobj": 252,
1082
+ "NOUN|Case=Loc|r-list": 253,
1083
+ "NOUN|Case=Loc|r-nmod": 254,
1084
+ "NOUN|Case=Loc|r-nsubj": 255,
1085
+ "NOUN|Case=Loc|r-obj": 256,
1086
+ "NOUN|Case=Loc|r-obl": 257,
1087
+ "NOUN|Case=Loc|r-obl:lmod": 258,
1088
+ "NOUN|Case=Loc|r-parataxis": 259,
1089
+ "NOUN|Case=Loc|r-xcomp": 260,
1090
+ "NOUN|Case=Loc|root": 261,
1091
+ "NOUN|Case=Tem|_": 262,
1092
+ "NOUN|Case=Tem|l-acl": 263,
1093
+ "NOUN|Case=Tem|l-advcl": 264,
1094
+ "NOUN|Case=Tem|l-amod": 265,
1095
+ "NOUN|Case=Tem|l-compound": 266,
1096
+ "NOUN|Case=Tem|l-csubj": 267,
1097
+ "NOUN|Case=Tem|l-nmod": 268,
1098
+ "NOUN|Case=Tem|l-nsubj": 269,
1099
+ "NOUN|Case=Tem|l-nsubj:outer": 270,
1100
+ "NOUN|Case=Tem|l-obj": 271,
1101
+ "NOUN|Case=Tem|l-obl:tmod": 272,
1102
+ "NOUN|Case=Tem|r-amod": 273,
1103
+ "NOUN|Case=Tem|r-ccomp": 274,
1104
+ "NOUN|Case=Tem|r-clf": 275,
1105
+ "NOUN|Case=Tem|r-compound:redup": 276,
1106
+ "NOUN|Case=Tem|r-conj": 277,
1107
+ "NOUN|Case=Tem|r-flat": 278,
1108
+ "NOUN|Case=Tem|r-iobj": 279,
1109
+ "NOUN|Case=Tem|r-list": 280,
1110
+ "NOUN|Case=Tem|r-nsubj": 281,
1111
+ "NOUN|Case=Tem|r-obj": 282,
1112
+ "NOUN|Case=Tem|r-obl:tmod": 283,
1113
+ "NOUN|Case=Tem|r-parataxis": 284,
1114
+ "NOUN|Case=Tem|r-xcomp": 285,
1115
+ "NOUN|Case=Tem|root": 286,
1116
+ "NOUN|Degree=Pos|_": 287,
1117
+ "NOUN|Degree=Pos|root": 288,
1118
+ "NOUN|NounType=Clf|_": 289,
1119
+ "NOUN|NounType=Clf|l-clf": 290,
1120
+ "NOUN|NounType=Clf|l-nmod": 291,
1121
+ "NOUN|NounType=Clf|l-nsubj": 292,
1122
+ "NOUN|NounType=Clf|l-obl": 293,
1123
+ "NOUN|NounType=Clf|r-ccomp": 294,
1124
+ "NOUN|NounType=Clf|r-clf": 295,
1125
+ "NOUN|NounType=Clf|r-compound:redup": 296,
1126
+ "NOUN|NounType=Clf|r-conj": 297,
1127
+ "NOUN|NounType=Clf|r-flat": 298,
1128
+ "NOUN|NounType=Clf|r-obj": 299,
1129
+ "NOUN|NounType=Clf|r-parataxis": 300,
1130
+ "NOUN|NounType=Clf|root": 301,
1131
+ "NOUN|_": 302,
1132
+ "NOUN|l-acl": 303,
1133
+ "NOUN|l-advcl": 304,
1134
+ "NOUN|l-amod": 305,
1135
+ "NOUN|l-ccomp": 306,
1136
+ "NOUN|l-clf": 307,
1137
+ "NOUN|l-compound": 308,
1138
+ "NOUN|l-csubj": 309,
1139
+ "NOUN|l-csubj:outer": 310,
1140
+ "NOUN|l-dislocated": 311,
1141
+ "NOUN|l-iobj": 312,
1142
+ "NOUN|l-list": 313,
1143
+ "NOUN|l-nmod": 314,
1144
+ "NOUN|l-nsubj": 315,
1145
+ "NOUN|l-nsubj:outer": 316,
1146
+ "NOUN|l-nsubj:pass": 317,
1147
+ "NOUN|l-obj": 318,
1148
+ "NOUN|l-obl": 319,
1149
+ "NOUN|l-obl:lmod": 320,
1150
+ "NOUN|l-obl:tmod": 321,
1151
+ "NOUN|l-vocative": 322,
1152
+ "NOUN|r-acl": 323,
1153
+ "NOUN|r-advcl": 324,
1154
+ "NOUN|r-amod": 325,
1155
+ "NOUN|r-ccomp": 326,
1156
+ "NOUN|r-clf": 327,
1157
+ "NOUN|r-compound:redup": 328,
1158
+ "NOUN|r-conj": 329,
1159
+ "NOUN|r-csubj": 330,
1160
+ "NOUN|r-dislocated": 331,
1161
+ "NOUN|r-flat": 332,
1162
+ "NOUN|r-flat:foreign": 333,
1163
+ "NOUN|r-iobj": 334,
1164
+ "NOUN|r-list": 335,
1165
+ "NOUN|r-nmod": 336,
1166
+ "NOUN|r-nsubj": 337,
1167
+ "NOUN|r-obj": 338,
1168
+ "NOUN|r-obl": 339,
1169
+ "NOUN|r-obl:lmod": 340,
1170
+ "NOUN|r-parataxis": 341,
1171
+ "NOUN|r-vocative": 342,
1172
+ "NOUN|r-xcomp": 343,
1173
+ "NOUN|root": 344,
1174
+ "NUM": 345,
1175
+ "NUM.": 346,
1176
+ "NUM|NumType=Ord|_": 347,
1177
+ "NUM|NumType=Ord|l-nsubj": 348,
1178
+ "NUM|NumType=Ord|l-nummod": 349,
1179
+ "NUM|NumType=Ord|l-obl": 350,
1180
+ "NUM|NumType=Ord|l-obl:lmod": 351,
1181
+ "NUM|NumType=Ord|l-obl:tmod": 352,
1182
+ "NUM|NumType=Ord|r-conj": 353,
1183
+ "NUM|NumType=Ord|r-flat": 354,
1184
+ "NUM|NumType=Ord|r-obj": 355,
1185
+ "NUM|NumType=Ord|root": 356,
1186
+ "NUM|_": 357,
1187
+ "NUM|l-acl": 358,
1188
+ "NUM|l-advcl": 359,
1189
+ "NUM|l-compound": 360,
1190
+ "NUM|l-csubj": 361,
1191
+ "NUM|l-dislocated": 362,
1192
+ "NUM|l-nsubj": 363,
1193
+ "NUM|l-nsubj:outer": 364,
1194
+ "NUM|l-nummod": 365,
1195
+ "NUM|l-obj": 366,
1196
+ "NUM|l-obl": 367,
1197
+ "NUM|l-obl:lmod": 368,
1198
+ "NUM|l-obl:tmod": 369,
1199
+ "NUM|r-ccomp": 370,
1200
+ "NUM|r-clf": 371,
1201
+ "NUM|r-compound": 372,
1202
+ "NUM|r-compound:redup": 373,
1203
+ "NUM|r-conj": 374,
1204
+ "NUM|r-flat": 375,
1205
+ "NUM|r-iobj": 376,
1206
+ "NUM|r-list": 377,
1207
+ "NUM|r-nummod": 378,
1208
+ "NUM|r-obj": 379,
1209
+ "NUM|r-obl": 380,
1210
+ "NUM|r-obl:tmod": 381,
1211
+ "NUM|r-parataxis": 382,
1212
+ "NUM|r-xcomp": 383,
1213
+ "NUM|root": 384,
1214
+ "PART": 385,
1215
+ "PART.": 386,
1216
+ "PART|_": 387,
1217
+ "PART|l-acl": 388,
1218
+ "PART|l-advcl": 389,
1219
+ "PART|l-advmod": 390,
1220
+ "PART|l-amod": 391,
1221
+ "PART|l-case": 392,
1222
+ "PART|l-cc": 393,
1223
+ "PART|l-csubj": 394,
1224
+ "PART|l-csubj:outer": 395,
1225
+ "PART|l-discourse": 396,
1226
+ "PART|l-discourse:sp": 397,
1227
+ "PART|l-dislocated": 398,
1228
+ "PART|l-mark": 399,
1229
+ "PART|l-nmod": 400,
1230
+ "PART|l-nsubj": 401,
1231
+ "PART|l-nsubj:outer": 402,
1232
+ "PART|l-nsubj:pass": 403,
1233
+ "PART|l-obj": 404,
1234
+ "PART|l-obl": 405,
1235
+ "PART|l-obl:lmod": 406,
1236
+ "PART|r-advmod": 407,
1237
+ "PART|r-case": 408,
1238
+ "PART|r-ccomp": 409,
1239
+ "PART|r-clf": 410,
1240
+ "PART|r-conj": 411,
1241
+ "PART|r-discourse": 412,
1242
+ "PART|r-discourse:sp": 413,
1243
+ "PART|r-dislocated": 414,
1244
+ "PART|r-fixed": 415,
1245
+ "PART|r-flat": 416,
1246
+ "PART|r-iobj": 417,
1247
+ "PART|r-list": 418,
1248
+ "PART|r-mark": 419,
1249
+ "PART|r-nsubj": 420,
1250
+ "PART|r-obj": 421,
1251
+ "PART|r-obl": 422,
1252
+ "PART|r-parataxis": 423,
1253
+ "PART|r-xcomp": 424,
1254
+ "PART|root": 425,
1255
+ "PRON": 426,
1256
+ "PRON.": 427,
1257
+ "PRON|Person=1|PronType=Prs|_": 428,
1258
+ "PRON|Person=1|PronType=Prs|l-acl": 429,
1259
+ "PRON|Person=1|PronType=Prs|l-advcl": 430,
1260
+ "PRON|Person=1|PronType=Prs|l-det": 431,
1261
+ "PRON|Person=1|PronType=Prs|l-iobj": 432,
1262
+ "PRON|Person=1|PronType=Prs|l-nsubj": 433,
1263
+ "PRON|Person=1|PronType=Prs|l-nsubj:outer": 434,
1264
+ "PRON|Person=1|PronType=Prs|l-obj": 435,
1265
+ "PRON|Person=1|PronType=Prs|l-obl": 436,
1266
+ "PRON|Person=1|PronType=Prs|l-vocative": 437,
1267
+ "PRON|Person=1|PronType=Prs|r-ccomp": 438,
1268
+ "PRON|Person=1|PronType=Prs|r-conj": 439,
1269
+ "PRON|Person=1|PronType=Prs|r-iobj": 440,
1270
+ "PRON|Person=1|PronType=Prs|r-nsubj": 441,
1271
+ "PRON|Person=1|PronType=Prs|r-obj": 442,
1272
+ "PRON|Person=1|PronType=Prs|r-obl": 443,
1273
+ "PRON|Person=1|PronType=Prs|r-obl:lmod": 444,
1274
+ "PRON|Person=1|PronType=Prs|root": 445,
1275
+ "PRON|Person=2|PronType=Prs|_": 446,
1276
+ "PRON|Person=2|PronType=Prs|l-advcl": 447,
1277
+ "PRON|Person=2|PronType=Prs|l-amod": 448,
1278
+ "PRON|Person=2|PronType=Prs|l-det": 449,
1279
+ "PRON|Person=2|PronType=Prs|l-nmod": 450,
1280
+ "PRON|Person=2|PronType=Prs|l-nsubj": 451,
1281
+ "PRON|Person=2|PronType=Prs|l-nsubj:outer": 452,
1282
+ "PRON|Person=2|PronType=Prs|l-obj": 453,
1283
+ "PRON|Person=2|PronType=Prs|l-obl": 454,
1284
+ "PRON|Person=2|PronType=Prs|l-vocative": 455,
1285
+ "PRON|Person=2|PronType=Prs|r-conj": 456,
1286
+ "PRON|Person=2|PronType=Prs|r-flat": 457,
1287
+ "PRON|Person=2|PronType=Prs|r-iobj": 458,
1288
+ "PRON|Person=2|PronType=Prs|r-obj": 459,
1289
+ "PRON|Person=2|PronType=Prs|r-obl": 460,
1290
+ "PRON|Person=2|PronType=Prs|root": 461,
1291
+ "PRON|Person=3|PronType=Prs|_": 462,
1292
+ "PRON|Person=3|PronType=Prs|l-advcl": 463,
1293
+ "PRON|Person=3|PronType=Prs|l-amod": 464,
1294
+ "PRON|Person=3|PronType=Prs|l-det": 465,
1295
+ "PRON|Person=3|PronType=Prs|l-dislocated": 466,
1296
+ "PRON|Person=3|PronType=Prs|l-expl": 467,
1297
+ "PRON|Person=3|PronType=Prs|l-iobj": 468,
1298
+ "PRON|Person=3|PronType=Prs|l-nsubj": 469,
1299
+ "PRON|Person=3|PronType=Prs|l-nsubj:outer": 470,
1300
+ "PRON|Person=3|PronType=Prs|l-nsubj:pass": 471,
1301
+ "PRON|Person=3|PronType=Prs|l-obj": 472,
1302
+ "PRON|Person=3|PronType=Prs|l-obl": 473,
1303
+ "PRON|Person=3|PronType=Prs|r-ccomp": 474,
1304
+ "PRON|Person=3|PronType=Prs|r-conj": 475,
1305
+ "PRON|Person=3|PronType=Prs|r-expl": 476,
1306
+ "PRON|Person=3|PronType=Prs|r-iobj": 477,
1307
+ "PRON|Person=3|PronType=Prs|r-nsubj": 478,
1308
+ "PRON|Person=3|PronType=Prs|r-obj": 479,
1309
+ "PRON|Person=3|PronType=Prs|r-obl": 480,
1310
+ "PRON|Person=3|PronType=Prs|root": 481,
1311
+ "PRON|PronType=Dem|_": 482,
1312
+ "PRON|PronType=Dem|l-acl": 483,
1313
+ "PRON|PronType=Dem|l-advcl": 484,
1314
+ "PRON|PronType=Dem|l-amod": 485,
1315
+ "PRON|PronType=Dem|l-compound": 486,
1316
+ "PRON|PronType=Dem|l-det": 487,
1317
+ "PRON|PronType=Dem|l-dislocated": 488,
1318
+ "PRON|PronType=Dem|l-expl": 489,
1319
+ "PRON|PronType=Dem|l-nsubj": 490,
1320
+ "PRON|PronType=Dem|l-nsubj:outer": 491,
1321
+ "PRON|PronType=Dem|l-obj": 492,
1322
+ "PRON|PronType=Dem|l-obl": 493,
1323
+ "PRON|PronType=Dem|l-obl:lmod": 494,
1324
+ "PRON|PronType=Dem|r-conj": 495,
1325
+ "PRON|PronType=Dem|r-det": 496,
1326
+ "PRON|PronType=Dem|r-expl": 497,
1327
+ "PRON|PronType=Dem|r-flat": 498,
1328
+ "PRON|PronType=Dem|r-iobj": 499,
1329
+ "PRON|PronType=Dem|r-obj": 500,
1330
+ "PRON|PronType=Dem|r-obl": 501,
1331
+ "PRON|PronType=Dem|r-obl:lmod": 502,
1332
+ "PRON|PronType=Dem|root": 503,
1333
+ "PRON|PronType=Int|_": 504,
1334
+ "PRON|PronType=Int|l-advcl": 505,
1335
+ "PRON|PronType=Int|l-amod": 506,
1336
+ "PRON|PronType=Int|l-det": 507,
1337
+ "PRON|PronType=Int|l-dislocated": 508,
1338
+ "PRON|PronType=Int|l-nsubj": 509,
1339
+ "PRON|PronType=Int|l-nsubj:outer": 510,
1340
+ "PRON|PronType=Int|l-obj": 511,
1341
+ "PRON|PronType=Int|l-obl": 512,
1342
+ "PRON|PronType=Int|l-vocative": 513,
1343
+ "PRON|PronType=Int|r-ccomp": 514,
1344
+ "PRON|PronType=Int|r-conj": 515,
1345
+ "PRON|PronType=Int|r-flat": 516,
1346
+ "PRON|PronType=Int|r-obj": 517,
1347
+ "PRON|PronType=Int|r-parataxis": 518,
1348
+ "PRON|PronType=Int|r-xcomp": 519,
1349
+ "PRON|PronType=Int|root": 520,
1350
+ "PRON|PronType=Prs|Reflex=Yes|_": 521,
1351
+ "PRON|PronType=Prs|Reflex=Yes|l-acl": 522,
1352
+ "PRON|PronType=Prs|Reflex=Yes|l-det": 523,
1353
+ "PRON|PronType=Prs|Reflex=Yes|l-nsubj": 524,
1354
+ "PRON|PronType=Prs|Reflex=Yes|l-obj": 525,
1355
+ "PRON|PronType=Prs|Reflex=Yes|l-obl": 526,
1356
+ "PRON|PronType=Prs|Reflex=Yes|r-dislocated": 527,
1357
+ "PRON|PronType=Prs|Reflex=Yes|r-obj": 528,
1358
+ "PRON|PronType=Prs|Reflex=Yes|r-obl": 529,
1359
+ "PRON|PronType=Prs|Reflex=Yes|root": 530,
1360
+ "PRON|PronType=Prs|_": 531,
1361
+ "PRON|PronType=Prs|l-det": 532,
1362
+ "PRON|PronType=Prs|l-nsubj": 533,
1363
+ "PRON|PronType=Prs|l-nsubj:outer": 534,
1364
+ "PRON|PronType=Prs|l-obj": 535,
1365
+ "PRON|PronType=Prs|r-conj": 536,
1366
+ "PRON|PronType=Prs|r-iobj": 537,
1367
+ "PRON|PronType=Prs|r-obj": 538,
1368
+ "PROPN": 539,
1369
+ "PROPN.": 540,
1370
+ "PROPN|Case=Loc|NameType=Geo|_": 541,
1371
+ "PROPN|Case=Loc|NameType=Geo|l-acl": 542,
1372
+ "PROPN|Case=Loc|NameType=Geo|l-advcl": 543,
1373
+ "PROPN|Case=Loc|NameType=Geo|l-amod": 544,
1374
+ "PROPN|Case=Loc|NameType=Geo|l-compound": 545,
1375
+ "PROPN|Case=Loc|NameType=Geo|l-csubj": 546,
1376
+ "PROPN|Case=Loc|NameType=Geo|l-dislocated": 547,
1377
+ "PROPN|Case=Loc|NameType=Geo|l-nmod": 548,
1378
+ "PROPN|Case=Loc|NameType=Geo|l-nsubj": 549,
1379
+ "PROPN|Case=Loc|NameType=Geo|l-nsubj:outer": 550,
1380
+ "PROPN|Case=Loc|NameType=Geo|l-obl": 551,
1381
+ "PROPN|Case=Loc|NameType=Geo|l-obl:lmod": 552,
1382
+ "PROPN|Case=Loc|NameType=Geo|r-conj": 553,
1383
+ "PROPN|Case=Loc|NameType=Geo|r-flat": 554,
1384
+ "PROPN|Case=Loc|NameType=Geo|r-iobj": 555,
1385
+ "PROPN|Case=Loc|NameType=Geo|r-obj": 556,
1386
+ "PROPN|Case=Loc|NameType=Geo|r-obl": 557,
1387
+ "PROPN|Case=Loc|NameType=Geo|r-obl:lmod": 558,
1388
+ "PROPN|Case=Loc|NameType=Geo|r-parataxis": 559,
1389
+ "PROPN|Case=Loc|NameType=Geo|r-xcomp": 560,
1390
+ "PROPN|Case=Loc|NameType=Geo|root": 561,
1391
+ "PROPN|Case=Loc|NameType=Nat|_": 562,
1392
+ "PROPN|Case=Loc|NameType=Nat|l-acl": 563,
1393
+ "PROPN|Case=Loc|NameType=Nat|l-advcl": 564,
1394
+ "PROPN|Case=Loc|NameType=Nat|l-amod": 565,
1395
+ "PROPN|Case=Loc|NameType=Nat|l-clf": 566,
1396
+ "PROPN|Case=Loc|NameType=Nat|l-compound": 567,
1397
+ "PROPN|Case=Loc|NameType=Nat|l-nmod": 568,
1398
+ "PROPN|Case=Loc|NameType=Nat|l-nsubj": 569,
1399
+ "PROPN|Case=Loc|NameType=Nat|l-nsubj:outer": 570,
1400
+ "PROPN|Case=Loc|NameType=Nat|l-nsubj:pass": 571,
1401
+ "PROPN|Case=Loc|NameType=Nat|l-obj": 572,
1402
+ "PROPN|Case=Loc|NameType=Nat|l-obl": 573,
1403
+ "PROPN|Case=Loc|NameType=Nat|l-obl:lmod": 574,
1404
+ "PROPN|Case=Loc|NameType=Nat|r-ccomp": 575,
1405
+ "PROPN|Case=Loc|NameType=Nat|r-conj": 576,
1406
+ "PROPN|Case=Loc|NameType=Nat|r-flat": 577,
1407
+ "PROPN|Case=Loc|NameType=Nat|r-iobj": 578,
1408
+ "PROPN|Case=Loc|NameType=Nat|r-nmod": 579,
1409
+ "PROPN|Case=Loc|NameType=Nat|r-obj": 580,
1410
+ "PROPN|Case=Loc|NameType=Nat|r-obl": 581,
1411
+ "PROPN|Case=Loc|NameType=Nat|r-obl:lmod": 582,
1412
+ "PROPN|Case=Loc|NameType=Nat|r-parataxis": 583,
1413
+ "PROPN|Case=Loc|NameType=Nat|r-xcomp": 584,
1414
+ "PROPN|Case=Loc|NameType=Nat|root": 585,
1415
+ "PROPN|NameType=Giv|_": 586,
1416
+ "PROPN|NameType=Giv|l-acl": 587,
1417
+ "PROPN|NameType=Giv|l-advcl": 588,
1418
+ "PROPN|NameType=Giv|l-amod": 589,
1419
+ "PROPN|NameType=Giv|l-compound": 590,
1420
+ "PROPN|NameType=Giv|l-dislocated": 591,
1421
+ "PROPN|NameType=Giv|l-nmod": 592,
1422
+ "PROPN|NameType=Giv|l-nsubj": 593,
1423
+ "PROPN|NameType=Giv|l-nsubj:outer": 594,
1424
+ "PROPN|NameType=Giv|l-nsubj:pass": 595,
1425
+ "PROPN|NameType=Giv|l-obj": 596,
1426
+ "PROPN|NameType=Giv|l-obl": 597,
1427
+ "PROPN|NameType=Giv|l-obl:lmod": 598,
1428
+ "PROPN|NameType=Giv|l-parataxis": 599,
1429
+ "PROPN|NameType=Giv|l-vocative": 600,
1430
+ "PROPN|NameType=Giv|r-appos": 601,
1431
+ "PROPN|NameType=Giv|r-ccomp": 602,
1432
+ "PROPN|NameType=Giv|r-conj": 603,
1433
+ "PROPN|NameType=Giv|r-dislocated": 604,
1434
+ "PROPN|NameType=Giv|r-flat": 605,
1435
+ "PROPN|NameType=Giv|r-iobj": 606,
1436
+ "PROPN|NameType=Giv|r-list": 607,
1437
+ "PROPN|NameType=Giv|r-nmod": 608,
1438
+ "PROPN|NameType=Giv|r-obj": 609,
1439
+ "PROPN|NameType=Giv|r-obl": 610,
1440
+ "PROPN|NameType=Giv|r-obl:lmod": 611,
1441
+ "PROPN|NameType=Giv|r-parataxis": 612,
1442
+ "PROPN|NameType=Giv|r-xcomp": 613,
1443
+ "PROPN|NameType=Giv|root": 614,
1444
+ "PROPN|NameType=Prs|_": 615,
1445
+ "PROPN|NameType=Prs|l-acl": 616,
1446
+ "PROPN|NameType=Prs|l-advcl": 617,
1447
+ "PROPN|NameType=Prs|l-amod": 618,
1448
+ "PROPN|NameType=Prs|l-compound": 619,
1449
+ "PROPN|NameType=Prs|l-dislocated": 620,
1450
+ "PROPN|NameType=Prs|l-nmod": 621,
1451
+ "PROPN|NameType=Prs|l-nsubj": 622,
1452
+ "PROPN|NameType=Prs|l-nsubj:outer": 623,
1453
+ "PROPN|NameType=Prs|l-obj": 624,
1454
+ "PROPN|NameType=Prs|l-obl": 625,
1455
+ "PROPN|NameType=Prs|r-conj": 626,
1456
+ "PROPN|NameType=Prs|r-dislocated": 627,
1457
+ "PROPN|NameType=Prs|r-flat": 628,
1458
+ "PROPN|NameType=Prs|r-iobj": 629,
1459
+ "PROPN|NameType=Prs|r-obj": 630,
1460
+ "PROPN|NameType=Prs|r-obl": 631,
1461
+ "PROPN|NameType=Prs|r-parataxis": 632,
1462
+ "PROPN|NameType=Prs|root": 633,
1463
+ "PROPN|NameType=Sur|_": 634,
1464
+ "PROPN|NameType=Sur|l-acl": 635,
1465
+ "PROPN|NameType=Sur|l-advcl": 636,
1466
+ "PROPN|NameType=Sur|l-amod": 637,
1467
+ "PROPN|NameType=Sur|l-compound": 638,
1468
+ "PROPN|NameType=Sur|l-csubj": 639,
1469
+ "PROPN|NameType=Sur|l-dislocated": 640,
1470
+ "PROPN|NameType=Sur|l-nmod": 641,
1471
+ "PROPN|NameType=Sur|l-nsubj": 642,
1472
+ "PROPN|NameType=Sur|l-nsubj:outer": 643,
1473
+ "PROPN|NameType=Sur|l-nsubj:pass": 644,
1474
+ "PROPN|NameType=Sur|l-obl": 645,
1475
+ "PROPN|NameType=Sur|l-obl:lmod": 646,
1476
+ "PROPN|NameType=Sur|l-vocative": 647,
1477
+ "PROPN|NameType=Sur|r-ccomp": 648,
1478
+ "PROPN|NameType=Sur|r-conj": 649,
1479
+ "PROPN|NameType=Sur|r-dislocated": 650,
1480
+ "PROPN|NameType=Sur|r-flat": 651,
1481
+ "PROPN|NameType=Sur|r-iobj": 652,
1482
+ "PROPN|NameType=Sur|r-list": 653,
1483
+ "PROPN|NameType=Sur|r-nmod": 654,
1484
+ "PROPN|NameType=Sur|r-nsubj": 655,
1485
+ "PROPN|NameType=Sur|r-obj": 656,
1486
+ "PROPN|NameType=Sur|r-obl": 657,
1487
+ "PROPN|NameType=Sur|r-obl:lmod": 658,
1488
+ "PROPN|NameType=Sur|r-parataxis": 659,
1489
+ "PROPN|NameType=Sur|r-xcomp": 660,
1490
+ "PROPN|NameType=Sur|root": 661,
1491
+ "PROPN|_": 662,
1492
+ "PROPN|l-nmod": 663,
1493
+ "PUNCT": 664,
1494
+ "PUNCT.": 665,
1495
+ "PUNCT|_": 666,
1496
+ "PUNCT|root": 667,
1497
+ "SCONJ": 668,
1498
+ "SCONJ.": 669,
1499
+ "SCONJ|_": 670,
1500
+ "SCONJ|l-case": 671,
1501
+ "SCONJ|l-cc": 672,
1502
+ "SCONJ|l-mark": 673,
1503
+ "SCONJ|l-nsubj": 674,
1504
+ "SCONJ|l-obl": 675,
1505
+ "SCONJ|r-case": 676,
1506
+ "SCONJ|r-iobj": 677,
1507
+ "SCONJ|r-mark": 678,
1508
+ "SCONJ|r-nsubj": 679,
1509
+ "SCONJ|r-nsubj:pass": 680,
1510
+ "SCONJ|r-obj": 681,
1511
+ "SCONJ|root": 682,
1512
+ "SYM": 683,
1513
+ "SYM.": 684,
1514
+ "SYM|_": 685,
1515
+ "SYM|l-nmod": 686,
1516
+ "SYM|l-nsubj": 687,
1517
+ "SYM|r-conj": 688,
1518
+ "SYM|r-nmod": 689,
1519
+ "SYM|r-xcomp": 690,
1520
+ "SYM|root": 691,
1521
+ "VERB": 692,
1522
+ "VERB.": 693,
1523
+ "VERB|Degree=Equ|VerbForm=Part|_": 694,
1524
+ "VERB|Degree=Equ|VerbForm=Part|l-amod": 695,
1525
+ "VERB|Degree=Equ|_": 696,
1526
+ "VERB|Degree=Equ|l-acl": 697,
1527
+ "VERB|Degree=Equ|l-advcl": 698,
1528
+ "VERB|Degree=Equ|l-ccomp": 699,
1529
+ "VERB|Degree=Equ|l-csubj": 700,
1530
+ "VERB|Degree=Equ|l-nsubj": 701,
1531
+ "VERB|Degree=Equ|l-obj": 702,
1532
+ "VERB|Degree=Equ|r-ccomp": 703,
1533
+ "VERB|Degree=Equ|r-compound:redup": 704,
1534
+ "VERB|Degree=Equ|r-conj": 705,
1535
+ "VERB|Degree=Equ|r-obj": 706,
1536
+ "VERB|Degree=Equ|r-parataxis": 707,
1537
+ "VERB|Degree=Equ|r-xcomp": 708,
1538
+ "VERB|Degree=Equ|root": 709,
1539
+ "VERB|Degree=Pos|VerbForm=Part|_": 710,
1540
+ "VERB|Degree=Pos|VerbForm=Part|l-amod": 711,
1541
+ "VERB|Degree=Pos|VerbForm=Part|r-amod": 712,
1542
+ "VERB|Degree=Pos|_": 713,
1543
+ "VERB|Degree=Pos|l-acl": 714,
1544
+ "VERB|Degree=Pos|l-advcl": 715,
1545
+ "VERB|Degree=Pos|l-ccomp": 716,
1546
+ "VERB|Degree=Pos|l-csubj": 717,
1547
+ "VERB|Degree=Pos|l-csubj:outer": 718,
1548
+ "VERB|Degree=Pos|l-dislocated": 719,
1549
+ "VERB|Degree=Pos|l-nsubj": 720,
1550
+ "VERB|Degree=Pos|l-nsubj:outer": 721,
1551
+ "VERB|Degree=Pos|l-obj": 722,
1552
+ "VERB|Degree=Pos|l-obl": 723,
1553
+ "VERB|Degree=Pos|l-vocative": 724,
1554
+ "VERB|Degree=Pos|r-advcl": 725,
1555
+ "VERB|Degree=Pos|r-ccomp": 726,
1556
+ "VERB|Degree=Pos|r-compound:redup": 727,
1557
+ "VERB|Degree=Pos|r-conj": 728,
1558
+ "VERB|Degree=Pos|r-dislocated": 729,
1559
+     "VERB|Degree=Pos|r-fixed": 730,
+     "VERB|Degree=Pos|r-flat:vv": 731,
+     "VERB|Degree=Pos|r-iobj": 732,
+     "VERB|Degree=Pos|r-obj": 733,
+     "VERB|Degree=Pos|r-obl": 734,
+     "VERB|Degree=Pos|r-parataxis": 735,
+     "VERB|Degree=Pos|r-xcomp": 736,
+     "VERB|Degree=Pos|root": 737,
+     "VERB|Polarity=Neg|VerbForm=Part|_": 738,
+     "VERB|Polarity=Neg|VerbForm=Part|l-amod": 739,
+     "VERB|Polarity=Neg|_": 740,
+     "VERB|Polarity=Neg|l-acl": 741,
+     "VERB|Polarity=Neg|l-advcl": 742,
+     "VERB|Polarity=Neg|l-ccomp": 743,
+     "VERB|Polarity=Neg|l-csubj": 744,
+     "VERB|Polarity=Neg|l-csubj:outer": 745,
+     "VERB|Polarity=Neg|l-nsubj": 746,
+     "VERB|Polarity=Neg|l-obl": 747,
+     "VERB|Polarity=Neg|r-advcl": 748,
+     "VERB|Polarity=Neg|r-ccomp": 749,
+     "VERB|Polarity=Neg|r-conj": 750,
+     "VERB|Polarity=Neg|r-flat:vv": 751,
+     "VERB|Polarity=Neg|r-obj": 752,
+     "VERB|Polarity=Neg|r-obl": 753,
+     "VERB|Polarity=Neg|r-parataxis": 754,
+     "VERB|Polarity=Neg|r-xcomp": 755,
+     "VERB|Polarity=Neg|root": 756,
+     "VERB|VerbForm=Part|_": 757,
+     "VERB|VerbForm=Part|l-amod": 758,
+     "VERB|VerbForm=Part|r-amod": 759,
+     "VERB|_": 760,
+     "VERB|l-acl": 761,
+     "VERB|l-advcl": 762,
+     "VERB|l-ccomp": 763,
+     "VERB|l-csubj": 764,
+     "VERB|l-csubj:outer": 765,
+     "VERB|l-csubj:pass": 766,
+     "VERB|l-dislocated": 767,
+     "VERB|l-nsubj": 768,
+     "VERB|l-nsubj:outer": 769,
+     "VERB|l-obj": 770,
+     "VERB|l-obl": 771,
+     "VERB|l-obl:lmod": 772,
+     "VERB|l-parataxis": 773,
+     "VERB|r-acl": 774,
+     "VERB|r-advcl": 775,
+     "VERB|r-ccomp": 776,
+     "VERB|r-compound:redup": 777,
+     "VERB|r-conj": 778,
+     "VERB|r-dislocated": 779,
+     "VERB|r-fixed": 780,
+     "VERB|r-flat:vv": 781,
+     "VERB|r-iobj": 782,
+     "VERB|r-list": 783,
+     "VERB|r-obj": 784,
+     "VERB|r-obl": 785,
+     "VERB|r-obl:lmod": 786,
+     "VERB|r-parataxis": 787,
+     "VERB|r-vocative": 788,
+     "VERB|r-xcomp": 789,
+     "VERB|root": 790
+   },
+   "layer_norm_eps": 1e-05,
+   "local_attention": 128,
+   "local_rope_theta": 10000.0,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "mlp_dropout": 0.0,
+   "model_type": "modernbert",
+   "norm_bias": false,
+   "norm_eps": 1e-05,
+   "num_attention_heads": 16,
+   "num_hidden_layers": 28,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "reference_compile": true,
+   "repad_logits_with_grad": false,
+   "sep_token_id": 2,
+   "sparse_pred_ignore_index": -100,
+   "sparse_prediction": false,
+   "tokenizer_class": "BertTokenizerFast",
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.3",
+   "vocab_size": 25078
+ }
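
The `label2id` keys in `config.json` pack several Universal Dependencies fields into one string. Judging from how `maker.py` builds these keys and how `ud.py` splits them again, a key such as `VERB|Polarity=Neg|l-advcl` appears to consist of the UPOS tag, optional FEATS, and a final field that is `_` (no arc), `root`, or a dependency relation prefixed with `l-` or `r-` (`l-` when the token precedes its head). The following is a minimal sketch of that reading; the helper name `split_ud_label` is hypothetical and is not part of the repository.

```python
def split_ud_label(label):
    """Split a pair label such as 'VERB|Polarity=Neg|l-advcl' into its fields.

    Hypothetical helper: it mirrors the split("|") logic seen in ud.py,
    but it is an illustration, not code from the repository.
    """
    if "|" not in label:
        # Labels without "|" (e.g. "VERB", "B-VERB.") are token-level tags.
        raise ValueError("not a pair label: " + label)
    parts = label.split("|")
    upos = parts[0]
    # Everything between the first and last field is FEATS ("_" when absent).
    feats = "|".join(parts[1:-1]) if len(parts) > 2 else "_"
    last = parts[-1]
    if last == "_":
        direction, deprel = None, None          # no arc for this pair
    elif last == "root":
        direction, deprel = None, "root"        # the token is the root
    else:
        direction, deprel = last[0], last[2:]   # "l" or "r", then the deprel
    return {"upos": upos, "feats": feats, "direction": direction, "deprel": deprel}


print(split_ud_label("VERB|Polarity=Neg|l-advcl"))
print(split_ud_label("VERB|root"))
```

Returning the raw `l`/`r` prefix rather than a word like "left" keeps the sketch close to the stored labels and avoids committing to an interpretation beyond what the scripts show.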
maker.py ADDED
@@ -0,0 +1,118 @@
+ #! /usr/bin/python3
+ src="KoichiYasuoka/modernbert-large-classical-chinese"
+ tgt="KoichiYasuoka/modernbert-large-classical-chinese-ud-embeds"
+ url="https://github.com/UniversalDependencies/UD_Classical_Chinese-Kyoto"
+ import os
+ d=os.path.basename(url)
+ os.system("test -d "+d+" || git clone --depth=1 "+url)
+ os.system("for F in train dev test ; do cp "+d+"/*-$F.conllu $F.conllu ; done")
+ class UDEmbedsDataset(object):
+   def __init__(self,conllu,tokenizer,embeddings=None):
+     self.conllu=open(conllu,"r",encoding="utf-8")
+     self.tokenizer=tokenizer
+     self.embeddings=embeddings
+     self.seeks=[0]
+     label=set(["SYM","SYM.","SYM|_"])
+     dep=set()
+     s=self.conllu.readline()
+     while s!="":
+       if s=="\n":
+         self.seeks.append(self.conllu.tell())
+       else:
+         w=s.split("\t")
+         if len(w)==10:
+           if w[0].isdecimal():
+             p=w[3]
+             q="" if w[5]=="_" else "|"+w[5]
+             d=("|" if w[6]=="0" else "|l-" if int(w[0])<int(w[6]) else "|r-")+w[7]
+             for k in [p,p+".","B-"+p,"B-"+p+".","I-"+p,"I-"+p+".",p+q+"|_",p+q+d]:
+               label.add(k)
+       s=self.conllu.readline()
+     self.label2id={l:i for i,l in enumerate(sorted(label))}
+   def __call__(*args):
+     lid={l:i for i,l in enumerate(sorted(set(sum([list(t.label2id) for t in args],[]))))}
+     for t in args:
+       t.label2id=lid
+     return lid
+   def __del__(self):
+     self.conllu.close()
+   __len__=lambda self:(len(self.seeks)-1)*2
+   def __getitem__(self,i):
+     self.conllu.seek(self.seeks[int(i/2)])
+     z,c,t,s=i%2,[],[""],False
+     while t[0]!="\n":
+       t=self.conllu.readline().split("\t")
+       if len(t)==10 and t[0].isdecimal():
+         if s:
+           t[1]=" "+t[1]
+         c.append(t)
+         s=t[9].find("SpaceAfter=No")<0
+     x=[True if t[6]=="0" or int(t[6])>j or sum([1 if int(c[i][6])==j+1 else 0 for i in range(j+1,len(c))])>0 else False for j,t in enumerate(c)]
+     v=self.tokenizer([t[1] for t in c],add_special_tokens=False)["input_ids"]
+     if z==0:
+       ids,upos=[self.tokenizer.cls_token_id],["SYM."]
+       for i,(j,k) in enumerate(zip(v,c)):
+         if j==[]:
+           j=[self.tokenizer.unk_token_id]
+         p=k[3] if x[i] else k[3]+"."
+         ids+=j
+         upos+=[p] if len(j)==1 else ["B-"+p]+["I-"+p]*(len(j)-1)
+       ids.append(self.tokenizer.sep_token_id)
+       upos.append("SYM.")
+       emb=self.embeddings
+     else:
+       import torch
+       if len(x)<127:
+         x=[True]*len(x)
+         w=(len(x)+1)*(len(x)+2)/2
+       else:
+         w=sum([len(x)-i+1 if b else 0 for i,b in enumerate(x)])+1
+         for i in range(len(x)):
+           if x[i]==False and w+len(x)-i<8192:
+             x[i]=True
+             w+=len(x)-i+1
+       p=[t[3] if t[5]=="_" else t[3]+"|"+t[5] for i,t in enumerate(c)]
+       d=[t[7] if t[6]=="0" else "l-"+t[7] if int(t[0])<int(t[6]) else "r-"+t[7] for t in c]
+       ids,upos=[-1],["SYM|_"]
+       for i in range(len(x)):
+         if x[i]:
+           ids.append(i)
+           upos.append(p[i]+"|"+d[i] if c[i][6]=="0" else p[i]+"|_")
+           for j in range(i+1,len(x)):
+             ids.append(j)
+             upos.append(p[j]+"|"+d[j] if int(c[j][6])==i+1 else p[i]+"|"+d[i] if int(c[i][6])==j+1 else p[j]+"|_")
+           if i>0 and w>8192:
+             while w>8192:
+               if upos[-1].endswith("|_"):
+                 upos.pop(-1)
+                 ids.pop(-1)
+                 w-=1
+               else:
+                 break
+           ids.append(-1)
+           upos.append("SYM|_")
+       with torch.no_grad():
+         m=[]
+         for j in v:
+           if j==[]:
+             j=[self.tokenizer.unk_token_id]
+           m.append(self.embeddings[j,:].sum(axis=0))
+         m.append(self.embeddings[self.tokenizer.sep_token_id,:])
+         emb=torch.stack(m)
+     return{"inputs_embeds":emb[ids,:],"labels":[self.label2id[p] for p in upos]}
+ from transformers import AutoTokenizer,AutoConfig,AutoModelForTokenClassification,DefaultDataCollator,TrainingArguments,Trainer
+ from tokenizers.pre_tokenizers import Sequence,Split
+ from tokenizers import Regex
+ tkz=AutoTokenizer.from_pretrained(src)
+ trainDS=UDEmbedsDataset("train.conllu",tkz)
+ devDS=UDEmbedsDataset("dev.conllu",tkz)
+ testDS=UDEmbedsDataset("test.conllu",tkz)
+ lid=trainDS(devDS,testDS)
+ cfg=AutoConfig.from_pretrained(src,num_labels=len(lid),label2id=lid,id2label={i:l for l,i in lid.items()},ignore_mismatched_sizes=True,trust_remote_code=True)
+ mdl=AutoModelForTokenClassification.from_pretrained(src,config=cfg,ignore_mismatched_sizes=True,trust_remote_code=True)
+ trainDS.embeddings=mdl.get_input_embeddings().weight
+ arg=TrainingArguments(num_train_epochs=10,per_device_train_batch_size=1,dataloader_pin_memory=False,output_dir=tgt,overwrite_output_dir=True,save_total_limit=2,learning_rate=5e-05,warmup_ratio=0.1,save_safetensors=False)
+ trn=Trainer(args=arg,data_collator=DefaultDataCollator(),model=mdl,train_dataset=trainDS)
+ trn.train()
+ trn.save_model(tgt)
+ tkz.save_pretrained(tgt)
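
In the second (odd-indexed) sample that `UDEmbedsDataset.__getitem__` produces, the `ids` list lays out word indices in a triangular pattern: each head candidate `i` is followed by every later word and a `-1` separator, and `-1` selects the last row of the stacked embeddings, which is the `[SEP]` embedding. The sketch below reproduces only that index layout for the simple case where every word is kept as a head candidate; the function name `pair_sequence_indices` is hypothetical. Under this reading, the `127` threshold in the script looks like the largest sentence length whose full layout still fits into 8192 positions, though the source does not state that rationale.

```python
def pair_sequence_indices(n):
    """Triangular index layout for n words when every word is a head candidate.

    Hypothetical helper that mirrors the loop over `ids` in maker.py;
    -1 stands for the separator row appended after the word embeddings.
    """
    ids = [-1]
    for i in range(n):
        ids.append(i)                # the candidate itself (root slot)
        ids.extend(range(i + 1, n))  # every later word paired with candidate i
        ids.append(-1)               # separator closing this block
    return ids


ids = pair_sequence_indices(3)
print(ids)
# The length matches the (n+1)*(n+2)/2 formula used for `w` in maker.py.
print(len(ids), (3 + 1) * (3 + 2) // 2)
```

For 126 words the layout has 127*128/2 = 8128 entries, which is within the model's `max_position_embeddings` of 8192, while 127 words would need 8256.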
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:12c29d71dc584bc88920e315b685564502dbb93ba85590d6a8399acfad2b871e
+ size 1483037698
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,68 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_input_names": [
+     "input_ids",
+     "attention_mask"
+   ],
+   "model_max_length": 1000000000000000019884624838656,
+   "never_split": [
+     "[CLS]",
+     "[PAD]",
+     "[SEP]",
+     "[UNK]",
+     "[MASK]"
+   ],
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": false,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizerFast",
+   "unk_token": "[UNK]"
+ }
ud.py ADDED
@@ -0,0 +1,150 @@
+ import numpy
+ from transformers import TokenClassificationPipeline
+
+ class BellmanFordTokenClassificationPipeline(TokenClassificationPipeline):
+   def __init__(self,**kwargs):
+     super().__init__(**kwargs)
+     x=self.model.config.label2id
+     y=[k for k in x if k.find("|")<0 and not k.startswith("I-")]
+     self.transition=numpy.full((len(x),len(x)),-numpy.inf)
+     for k,v in x.items():
+       if k.find("|")<0:
+         for j in ["I-"+k[2:]] if k.startswith("B-") else [k]+y if k.startswith("I-") else y:
+           self.transition[v,x[j]]=0
+   def check_model_type(self,supported_models):
+     pass
+   def postprocess(self,model_outputs,**kwargs):
+     if "logits" not in model_outputs:
+       return self.postprocess(model_outputs[0],**kwargs)
+     return self.bellman_ford_token_classification(model_outputs,**kwargs)
+   def bellman_ford_token_classification(self,model_outputs,**kwargs):
+     m=model_outputs["logits"][0].numpy()
+     e=numpy.exp(m-numpy.max(m,axis=-1,keepdims=True))
+     z=e/e.sum(axis=-1,keepdims=True)
+     for i in range(m.shape[0]-1,0,-1):
+       m[i-1]+=numpy.max(m[i]+self.transition,axis=1)
+     k=[numpy.argmax(m[0]+self.transition[0])]
+     for i in range(1,m.shape[0]):
+       k.append(numpy.argmax(m[i]+self.transition[k[-1]]))
+     w=[{"entity":self.model.config.id2label[j],"start":s,"end":e,"score":z[i,j]} for i,((s,e),j) in enumerate(zip(model_outputs["offset_mapping"][0].tolist(),k)) if s<e]
+     if "aggregation_strategy" in kwargs and kwargs["aggregation_strategy"]!="none":
+       for i,t in reversed(list(enumerate(w))):
+         p=t.pop("entity")
+         if p.startswith("I-"):
+           w[i-1]["score"]=min(w[i-1]["score"],t["score"])
+           w[i-1]["end"]=w.pop(i)["end"]
+         elif p.startswith("B-"):
+           t["entity_group"]=p[2:]
+         else:
+           t["entity_group"]=p
+     for t in w:
+       t["text"]=model_outputs["sentence"][t["start"]:t["end"]]
+     return w
+
+ class UniversalDependenciesPipeline(BellmanFordTokenClassificationPipeline):
+   def __init__(self,**kwargs):
+     kwargs["aggregation_strategy"]="simple"
+     super().__init__(**kwargs)
+     x=self.model.config.label2id
+     self.root=numpy.full((len(x)),-numpy.inf)
+     self.left_arc=numpy.full((len(x)),-numpy.inf)
+     self.right_arc=numpy.full((len(x)),-numpy.inf)
+     for k,v in x.items():
+       if k.endswith("|root"):
+         self.root[v]=0
+       elif k.find("|l-")>0:
+         self.left_arc[v]=0
+       elif k.find("|r-")>0:
+         self.right_arc[v]=0
+   def postprocess(self,model_outputs,**kwargs):
+     import torch
+     kwargs["aggregation_strategy"]="simple"
+     if "logits" not in model_outputs:
+       return self.postprocess(model_outputs[0],**kwargs)
+     w=self.bellman_ford_token_classification(model_outputs,**kwargs)
+     off=[(t["start"],t["end"]) for t in w]
+     for i,(s,e) in reversed(list(enumerate(off))):
+       if s<e:
+         d=w[i]["text"]
+         j=len(d)-len(d.lstrip())
+         if j>0:
+           d=d.lstrip()
+           off[i]=(off[i][0]+j,off[i][1])
+         j=len(d)-len(d.rstrip())
+         if j>0:
+           d=d.rstrip()
+           off[i]=(off[i][0],off[i][1]-j)
+         if d.strip()=="":
+           off.pop(i)
+           w.pop(i)
+     v=self.tokenizer([t["text"] for t in w],add_special_tokens=False)
+     x=[not t["entity_group"].endswith(".") for t in w]
+     if len(x)<127:
+       x=[True]*len(x)
+     else:
+       k=sum([len(x)-i+1 if b else 0 for i,b in enumerate(x)])+1
+       for i in numpy.argsort(numpy.array([t["score"] for t in w])):
+         if x[i]==False and k+len(x)-i<8192:
+           x[i]=True
+           k+=len(x)-i+1
+     ids=[-1]
+     for i in range(len(x)):
+       if x[i]:
+         ids.append(i)
+         for j in range(i+1,len(x)):
+           ids.append(j)
+         ids.append(-1)
+     with torch.no_grad():
+       e=self.model.get_input_embeddings().weight
+       m=[]
+       for j in v["input_ids"]:
+         if j==[]:
+           j=[self.tokenizer.unk_token_id]
+         m.append(e[j,:].sum(axis=0))
+       m.append(e[self.tokenizer.sep_token_id,:])
+       m=torch.stack(m).to(self.device)
+       e=self.model(inputs_embeds=torch.unsqueeze(m[ids,:],0))
+     m=e.logits[0].cpu().numpy()
+     e=numpy.full((len(x),len(x),m.shape[-1]),m.min())
+     k=1
+     for i in range(len(x)):
+       if x[i]:
+         e[i,i]=m[k]+self.root
+         k+=1
+         for j in range(1,len(x)-i):
+           e[i+j,i]=m[k]+self.left_arc
+           e[i,i+j]=m[k]+self.right_arc
+           k+=1
+         k+=1
+     m,p=numpy.max(e,axis=2),numpy.argmax(e,axis=2)
+     h=self.chu_liu_edmonds(m)
+     z=[i for i,j in enumerate(h) if i==j]
+     if len(z)>1:
+       k,h=z[numpy.argmax(m[z,z])],numpy.min(m)-numpy.max(m)
+       m[:,z]+=[[0 if j in z and (i!=j or i==k) else h for i in z] for j in range(m.shape[0])]
+       h=self.chu_liu_edmonds(m)
+     q=[self.model.config.id2label[p[j,i]].split("|") for i,j in enumerate(h)]
+     t=model_outputs["sentence"].replace("\n"," ")
+     u="# text = "+t+"\n"
+     for i,(s,e) in enumerate(off):
+       u+="\t".join([str(i+1),t[s:e],t[s:e],q[i][0],"_","_" if len(q[i])<3 else "|".join(q[i][1:-1]),str(0 if h[i]==i else h[i]+1),"root" if q[i][-1]=="root" else q[i][-1][2:],"_","_" if i+1<len(off) and e<off[i+1][0] else "SpaceAfter=No"])+"\n"
+     return u+"\n"
+   def chu_liu_edmonds(self,matrix):
+     h=numpy.argmax(matrix,axis=0)
+     x=[-1 if i==j else j for i,j in enumerate(h)]
+     for b in [lambda x,i,j:-1 if i not in x else x[i],lambda x,i,j:-1 if j<0 else x[j]]:
+       y=[]
+       while x!=y:
+         y=list(x)
+         for i,j in enumerate(x):
+           x[i]=b(x,i,j)
+       if max(x)<0:
+         return h
+     y,x=[i for i,j in enumerate(x) if j==max(x)],[i for i,j in enumerate(x) if j<max(x)]
+     z=matrix-numpy.max(matrix,axis=0)
+     m=numpy.block([[z[x,:][:,x],numpy.max(z[x,:][:,y],axis=1).reshape(len(x),1)],[numpy.max(z[y,:][:,x],axis=0),numpy.max(z[y,y])]])
+     k=[j if i==len(x) else x[j] if j<len(x) else y[numpy.argmax(z[y,x[i]])] for i,j in enumerate(self.chu_liu_edmonds(m))]
+     h=[j if i in y else k[x.index(i)] for i,j in enumerate(h)]
+     i=y[numpy.argmax(z[x[k[-1]],y] if k[-1]<len(x) else z[y,y])]
+     h[i]=x[k[-1]] if k[-1]<len(x) else i
+     return h
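
The `transition` matrix built in `BellmanFordTokenClassificationPipeline.__init__` restricts which token-level label may follow another: a `B-` label must be continued by the matching `I-` label, an `I-` label may repeat or hand over to any label that can start a word, and every other label may only be followed by a word-starting label. The sketch below restates that rule with plain sets instead of a NumPy matrix; the function name `allowed_next` and the toy label list are assumptions made for illustration, and pair labels containing `|` are left out because the pipeline never assigns transitions from them.

```python
def allowed_next(label, labels):
    """Return the set of token-level labels allowed to follow `label`.

    Hypothetical restatement of the rule encoded as 0 / -inf entries
    in the pipeline's transition matrix.
    """
    # Labels that can begin a word: no "|" (not a pair label) and not "I-".
    starters = {k for k in labels if "|" not in k and not k.startswith("I-")}
    if label.startswith("B-"):
        return {"I-" + label[2:]}       # a begun word must be continued
    if label.startswith("I-"):
        return {label} | starters       # continue the word or start a new one
    return starters                     # single-piece word: start a new one


toy_labels = ["NOUN", "B-NOUN", "I-NOUN", "VERB", "VERB|root"]
for name in ["NOUN", "B-NOUN", "I-NOUN"]:
    print(name, sorted(allowed_next(name, toy_labels)))
```

Keeping the rule as a lookup of allowed successors makes it easy to check by hand, whereas the pipeline stores the same information as additive scores so that disallowed paths drop out during decoding.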
vocab.txt ADDED
The diff for this file is too large to render. See raw diff