johnnyboycurtis committed (verified)
Commit ebf6ee3 · 1 Parent(s): f93b876

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,877 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:2637346
+ - loss:CachedMultipleNegativesSymmetricRankingLoss
+ - loss:CachedMultipleNegativesRankingLoss
+ - loss:CoSENTLoss
+ widget:
+ - source_sentence: A modern bathtub in a bathroom is displayed.
+   sentences:
+   - Different types of tiles are on the walls, floor and tub.
+   - A man sitting on a park bench looking towards a fountain and sculpture.
+   - A bathroom with a shower and his and her sinks.
+ - source_sentence: The people are sleeping.
+   sentences:
+   - A white dog swims in the water while holding a red object in its mouth.
+   - A man and young boy asleep in a chair.
+   - A group of people sit in an open, plaza-like area with large bushes and victorian-styled
+     buildings in a row behind them, many of which are made indistinct by a heavy blur
+     on the right side of the picture.
+ - source_sentence: A man is playing the drums.
+   sentences:
+   - A man plays the drum.
+   - A woman is swimming in the water.
+   - The lady peeled the shrimp.
+ - source_sentence: who sings i'm so tired of being alone
+   sentences:
+   - Tree of life (biology) The term phylogeny for the evolutionary relationships of
+     species through time was coined by Ernst Haeckel, who went further than Darwin
+     in proposing phylogenic histories of life. In contemporary usage, tree of life
+     refers to the compilation of comprehensive phylogenetic databases rooted at the
+     last universal common ancestor of life on Earth. The Open Tree of Life, first
+     published 2015, is a project to compile such a database for free public access.
+   - The Thomas Crown Affair (1968 film) The Thomas Crown Affair is a 1968 film directed
+     and produced by Norman Jewison and starring Steve McQueen and Faye Dunaway. This
+     heist film was nominated for two Academy Awards, winning Best Original Song for
+     Michel Legrand's "Windmills of Your Mind". A remake was released in 1999 and a
+     second remake is currently in the development stages.
+   - 'Tired of Being Alone In addition to Texas, "Tired of Being Alone" has also been
+     covered by Michael Bolton, Tom Jones, the Subdudes and by Eran James. Graham Bonnet
+     of Rainbow, MSG, and Alcatrazz fame covered "Tired of Being Alone" on 1977''s
+     "Graham Bonnet". The soul group Quiet Elegance, who were stablemates at Hi Records
+     with Green and had toured with him, also released a cover of the song on their
+     albums You''ve Got My Mind Messed Up (1990) and The Complete Quiet Elegance (2003).
+     Tarja Turunen covered the song on her 2012 album Act I: Live in Rosario. American
+     singer Sybil released a cover as a non-album single in 1996, peaking at #53 in
+     the UK. The original Al Green version was featured in the 1995 film Dead Presidents.'
+ - source_sentence: A sleeping baby in a pink striped outfit.
+   sentences:
+   - Three young men and a young woman wearing sneakers are leaping in midair at the
+     top of a flight of concrete stairs.
+   - A little baby cradled in someones arms.
+   - A group of hikers traveling along a rock strewn creek bed.
+ datasets:
+ - sentence-transformers/all-nli
+ - sentence-transformers/quora-duplicates
+ - sentence-transformers/natural-questions
+ - sentence-transformers/stsb
+ - sentence-transformers/sentence-compression
+ - sentence-transformers/simple-wiki
+ - sentence-transformers/altlex
+ - sentence-transformers/coco-captions
+ - sentence-transformers/flickr30k-captions
+ - sentence-transformers/yahoo-answers
+ - sentence-transformers/stackexchange-duplicates
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy
+ - pearson_cosine
+ - spearman_cosine
+ model-index:
+ - name: ModernBERT-small for General Purpose Similarity
+   results:
+   - task:
+       type: triplet
+       name: Triplet
+     dataset:
+       name: all nli dev
+       type: all-nli-dev
+     metrics:
+     - type: cosine_accuracy
+       value: 0.8807715773582458
+       name: Cosine Accuracy
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: sts dev
+       type: sts-dev
+     metrics:
+     - type: pearson_cosine
+       value: 0.8290433363537696
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.8276208329210781
+       name: Spearman Cosine
+ ---
+
+ # ModernBERT-small for General Purpose Similarity
+
+ This is a [sentence-transformers](https://www.SBERT.net) model trained on the [nli](https://huggingface.co/datasets/sentence-transformers/all-nli), [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates), [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [stsb](https://huggingface.co/datasets/sentence-transformers/stsb), [sentence_compression](https://huggingface.co/datasets/sentence-transformers/sentence-compression), [simple_wiki](https://huggingface.co/datasets/sentence-transformers/simple-wiki), [altlex](https://huggingface.co/datasets/sentence-transformers/altlex), [coco_captions](https://huggingface.co/datasets/sentence-transformers/coco-captions), [flickr30k_captions](https://huggingface.co/datasets/sentence-transformers/flickr30k-captions), [yahoo_answers](https://huggingface.co/datasets/sentence-transformers/yahoo-answers) and [stack_exchange](https://huggingface.co/datasets/sentence-transformers/stackexchange-duplicates) datasets. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** 1024 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Datasets:**
+     - [nli](https://huggingface.co/datasets/sentence-transformers/all-nli)
+     - [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates)
+     - [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
+     - [stsb](https://huggingface.co/datasets/sentence-transformers/stsb)
+     - [sentence_compression](https://huggingface.co/datasets/sentence-transformers/sentence-compression)
+     - [simple_wiki](https://huggingface.co/datasets/sentence-transformers/simple-wiki)
+     - [altlex](https://huggingface.co/datasets/sentence-transformers/altlex)
+     - [coco_captions](https://huggingface.co/datasets/sentence-transformers/coco-captions)
+     - [flickr30k_captions](https://huggingface.co/datasets/sentence-transformers/flickr30k-captions)
+     - [yahoo_answers](https://huggingface.co/datasets/sentence-transformers/yahoo-answers)
+     - [stack_exchange](https://huggingface.co/datasets/sentence-transformers/stackexchange-duplicates)
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
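+
+ For reference, the Pooling module above mean-pools the transformer's token embeddings (ignoring padding) into the 384-dimensional sentence embedding. A minimal sketch of that computation with plain `transformers`, reusing the placeholder model id from the usage example below:
+
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+ model_id = "sentence_transformers_model_id"  # placeholder id, as elsewhere in this card
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModel.from_pretrained(model_id)
+
+ encoded = tokenizer(["A man is playing the drums."], padding=True,
+                     truncation=True, max_length=1024, return_tensors="pt")
+ with torch.no_grad():
+     token_embeddings = model(**encoded).last_hidden_state  # [batch, seq_len, 384]
+
+ # Mean pooling: sum real-token embeddings and divide by the number of real tokens.
+ mask = encoded["attention_mask"].unsqueeze(-1).float()
+ sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
+ print(sentence_embeddings.shape)  # torch.Size([1, 384])
+ ```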
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ queries = [
+     "A sleeping baby in a pink striped outfit.",
+ ]
+ documents = [
+     'A little baby cradled in someones arms.',
+     'A group of hikers traveling along a rock strewn creek bed.',
+     'Three young men and a young woman wearing sneakers are leaping in midair at the top of a flight of concrete stairs.',
+ ]
+ query_embeddings = model.encode_query(queries)
+ document_embeddings = model.encode_document(documents)
+ print(query_embeddings.shape, document_embeddings.shape)
+ # (1, 384) (3, 384)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(query_embeddings, document_embeddings)
+ print(similarities)
+ # tensor([[ 0.5804, 0.0193, -0.1261]])
+ ```
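+
+ The introduction also lists paraphrase mining among the supported tasks. As a hedged sketch, the `paraphrase_mining` utility shipped with Sentence Transformers scores all sentence pairs in a corpus with this model (the sentences here are illustrative):
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.util import paraphrase_mining
+
+ model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id
+ sentences = [
+     "A man is playing the drums.",
+     "A man plays the drum.",
+     "A woman is swimming in the water.",
+ ]
+ # Returns [score, i, j] triples sorted by decreasing cosine similarity.
+ for score, i, j in paraphrase_mining(model, sentences, top_k=1):
+     print(f"{score:.3f}  {sentences[i]!r} <-> {sentences[j]!r}")
+ ```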
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Triplet
+
+ * Dataset: `all-nli-dev`
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | **cosine_accuracy** | **0.8808** |
+
+ #### Semantic Similarity
+
+ * Dataset: `sts-dev`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | pearson_cosine      | 0.829      |
+ | **spearman_cosine** | **0.8276** |
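+
+ Both evaluations should be reproducible with the evaluators linked above. A sketch, assuming the public dev/validation splits of the corresponding datasets:
+
+ ```python
+ from datasets import load_dataset
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator, TripletEvaluator
+
+ model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id
+
+ # Triplet accuracy on the all-nli dev split.
+ nli = load_dataset("sentence-transformers/all-nli", "triplet", split="dev")
+ triplet_eval = TripletEvaluator(nli["anchor"], nli["positive"], nli["negative"], name="all-nli-dev")
+ print(triplet_eval(model))  # {'all-nli-dev_cosine_accuracy': 0.88...}
+
+ # Pearson/Spearman correlation on the STSb validation split.
+ stsb = load_dataset("sentence-transformers/stsb", split="validation")
+ sts_eval = EmbeddingSimilarityEvaluator(stsb["sentence1"], stsb["sentence2"], stsb["score"], name="sts-dev")
+ print(sts_eval(model))  # {'sts-dev_pearson_cosine': 0.82..., 'sts-dev_spearman_cosine': 0.82...}
+ ```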
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Datasets
+ <details><summary>nli</summary>
+
+ #### nli
+
+ * Dataset: [nli](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
+ * Size: 557,850 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | anchor | positive | negative |
+   |:--------|:-------|:---------|:---------|
+   | type | string | string | string |
+   | details | <ul><li>min: 7 tokens</li><li>mean: 10.46 tokens</li><li>max: 46 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.91 tokens</li><li>max: 40 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 13.49 tokens</li><li>max: 51 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | negative |
+   |:-------|:---------|:---------|
+   | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> | <code>A person is at a diner, ordering an omelette.</code> |
+   | <code>Children smiling and waving at camera</code> | <code>There are children present</code> | <code>The kids are frowning</code> |
+   | <code>A boy is jumping on skateboard in the middle of a red bridge.</code> | <code>The boy does a skateboarding trick.</code> | <code>The boy skates down the sidewalk.</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
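+
+ For reference, this loss is constructed directly from the parameters above; a minimal sketch (the model id is a placeholder for the SentenceTransformer being trained):
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.losses import CachedMultipleNegativesSymmetricRankingLoss
+
+ model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id
+ # The cached (GradCache) variant embeds in mini-batches of 64 so a large
+ # effective batch size fits in memory; scale=20.0 matches the JSON above.
+ loss = CachedMultipleNegativesSymmetricRankingLoss(model, scale=20.0, mini_batch_size=64)
+ ```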
+ <details><summary>quora</summary>
+
+ #### quora
+
+ * Dataset: [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
+ * Size: 101,762 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | anchor | positive | negative |
+   |:--------|:-------|:---------|:---------|
+   | type | string | string | string |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 13.85 tokens</li><li>max: 42 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 13.63 tokens</li><li>max: 44 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 14.68 tokens</li><li>max: 61 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | negative |
+   |:-------|:---------|:---------|
+   | <code>Why in India do we not have one on one political debate as in USA?</code> | <code>Why cant we have a public debate between politicians in India like the one in US?</code> | <code>Can people on Quora stop India Pakistan debate? We are sick and tired seeing this everyday in bulk?</code> |
+   | <code>What is OnePlus One?</code> | <code>How is oneplus one?</code> | <code>Why is OnePlus One so good?</code> |
+   | <code>Does our mind control our emotions?</code> | <code>How do smart and successful people control their emotions?</code> | <code>How can I control my positive emotions for the people whom I love but they don't care about me?</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>natural_questions</summary>
+
+ #### natural_questions
+
+ * Dataset: [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
+ * Size: 100,231 training samples
+ * Columns: <code>query</code> and <code>answer</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | query | answer |
+   |:--------|:------|:-------|
+   | type | string | string |
+   | details | <ul><li>min: 10 tokens</li><li>mean: 12.47 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 138.32 tokens</li><li>max: 556 tokens</li></ul> |
+ * Samples:
+   | query | answer |
+   |:------|:-------|
+   | <code>when did richmond last play in a preliminary final</code> | <code>Richmond Football Club Richmond began 2017 with 5 straight wins, a feat it had not achieved since 1995. A series of close losses hampered the Tigers throughout the middle of the season, including a 5-point loss to the Western Bulldogs, 2-point loss to Fremantle, and a 3-point loss to the Giants. Richmond ended the season strongly with convincing victories over Fremantle and St Kilda in the final two rounds, elevating the club to 3rd on the ladder. Richmond's first final of the season against the Cats at the MCG attracted a record qualifying final crowd of 95,028; the Tigers won by 51 points. Having advanced to the first preliminary finals for the first time since 2001, Richmond defeated Greater Western Sydney by 36 points in front of a crowd of 94,258 to progress to the Grand Final against Adelaide, their first Grand Final appearance since 1982. The attendance was 100,021, the largest crowd to a grand final since 1986. The Crows led at quarter time and led by as many as 13, but the Tig...</code> |
+   | <code>who sang what in the world's come over you</code> | <code>Jack Scott (singer) At the beginning of 1960, Scott again changed record labels, this time to Top Rank Records.[1] He then recorded four Billboard Hot 100 hits – "What in the World's Come Over You" (#5), "Burning Bridges" (#3) b/w "Oh Little One" (#34), and "It Only Happened Yesterday" (#38).[1] "What in the World's Come Over You" was Scott's second gold disc winner.[6] Scott continued to record and perform during the 1960s and 1970s.[1] His song "You're Just Gettin' Better" reached the country charts in 1974.[1] In May 1977, Scott recorded a Peel session for BBC Radio 1 disc jockey, John Peel.</code> |
+   | <code>who produces the most wool in the world</code> | <code>Wool Global wool production is about 2 million tonnes per year, of which 60% goes into apparel. Wool comprises ca 3% of the global textile market, but its value is higher owing to dying and other modifications of the material.[1] Australia is a leading producer of wool which is mostly from Merino sheep but has been eclipsed by China in terms of total weight.[30] New Zealand (2016) is the third-largest producer of wool, and the largest producer of crossbred wool. Breeds such as Lincoln, Romney, Drysdale, and Elliotdale produce coarser fibers, and wool from these sheep is usually used for making carpets.</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>stsb</summary>
+
+ #### stsb
+
+ * Dataset: [stsb](https://huggingface.co/datasets/sentence-transformers/stsb) at [ab7a5ac](https://huggingface.co/datasets/sentence-transformers/stsb/tree/ab7a5ac0e35aa22088bdcf23e7fd99b220e53308)
+ * Size: 5,749 training samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | sentence1 | sentence2 | score |
+   |:--------|:----------|:----------|:------|
+   | type | string | string | float |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 10.16 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 10.12 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | sentence1 | sentence2 | score |
+   |:----------|:----------|:------|
+   | <code>A plane is taking off.</code> | <code>An air plane is taking off.</code> | <code>1.0</code> |
+   | <code>A man is playing a large flute.</code> | <code>A man is playing a flute.</code> | <code>0.76</code> |
+   | <code>A man is spreading shreded cheese on a pizza.</code> | <code>A man is spreading shredded cheese on an uncooked pizza.</code> | <code>0.76</code> |
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "pairwise_cos_sim"
+   }
+   ```
+ </details>
+ <details><summary>sentence_compression</summary>
+
+ #### sentence_compression
+
+ * Dataset: [sentence_compression](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
+ * Size: 180,000 training samples
+ * Columns: <code>text</code> and <code>simplified</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | text | simplified |
+   |:--------|:-----|:-----------|
+   | type | string | string |
+   | details | <ul><li>min: 12 tokens</li><li>mean: 33.95 tokens</li><li>max: 127 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 11.56 tokens</li><li>max: 29 tokens</li></ul> |
+ * Samples:
+   | text | simplified |
+   |:-----|:-----------|
+   | <code>The USHL completed an expansion draft on Monday as 10 players who were on the rosters of USHL teams during the 2009-10 season were selected by the League's two newest entries, the Muskegon Lumberjacks and Dubuque Fighting Saints.</code> | <code>USHL completes expansion draft</code> |
+   | <code>Major League Baseball Commissioner Bud Selig will be speaking at St. Norbert College next month.</code> | <code>Bud Selig to speak at St. Norbert College</code> |
+   | <code>It's fresh cherry time in Michigan and the best time to enjoy this delicious and nutritious fruit.</code> | <code>It's cherry time</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>simple_wiki</summary>
+
+ #### simple_wiki
+
+ * Dataset: [simple_wiki](https://huggingface.co/datasets/sentence-transformers/simple-wiki) at [60fd9b4](https://huggingface.co/datasets/sentence-transformers/simple-wiki/tree/60fd9b4680642ace0e2604cc2de44d376df419a7)
+ * Size: 102,225 training samples
+ * Columns: <code>text</code> and <code>simplified</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | text | simplified |
+   |:--------|:-----|:-----------|
+   | type | string | string |
+   | details | <ul><li>min: 9 tokens</li><li>mean: 35.55 tokens</li><li>max: 173 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 29.29 tokens</li><li>max: 135 tokens</li></ul> |
+ * Samples:
+   | text | simplified |
+   |:-----|:-----------|
+   | <code>The greatest example has been in his present job ( then , Minister for Foreign Affairs ) , where he has perforce concentrated on Anglo-Irish relations and , in particular the North ( i.e. , Northern Ireland ) .</code> | <code>The greatest example has been in his present job ( then , Minister for Foreign Affairs ) , where he has perforce concentrated on Anglo-Irish relations and , in particular Northern Ireland ( .</code> |
+   | <code>His reputation rose further when opposition leaders under parliamentary privilege alleged that Taoiseach Charles Haughey , who in January 1982 had been Leader of the Opposition , had not merely rung the President 's Office but threatened to end the career of the army officer who took the call and who , on Hillery 's explicit instructions , had refused to put through the call to the President .</code> | <code>President Hillery refused to speak to any opposition party politicians , but when Charles Haughey , who was Leader of the Opposition , had rang the President 's Office he threatened to end the career of the army officer answered and refused on Hillery 's explicit orders to put the call through to the President .</code> |
+   | <code>He considered returning to medicine , perhaps moving with his wife , Maeve ( also a doctor ) to Africa .</code> | <code>He thought about returning to medicine , perhaps moving with his wife , Maeve ( also a doctor ) to Africa .</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>altlex</summary>
+
+ #### altlex
+
+ * Dataset: [altlex](https://huggingface.co/datasets/sentence-transformers/altlex) at [97eb209](https://huggingface.co/datasets/sentence-transformers/altlex/tree/97eb20963455c361d5a81c107c3596cff9e0cd82)
+ * Size: 112,696 training samples
+ * Columns: <code>text</code> and <code>simplified</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | text | simplified |
+   |:--------|:-----|:-----------|
+   | type | string | string |
+   | details | <ul><li>min: 9 tokens</li><li>mean: 32.19 tokens</li><li>max: 121 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 26.81 tokens</li><li>max: 115 tokens</li></ul> |
+ * Samples:
+   | text | simplified |
+   |:-----|:-----------|
+   | <code>A set of 31 guns , cast 1729-1749 by the first master founder at the Royal Foundry , later the Royal Arsenal , Woolwich , were used to fire salutes until 1907 , often for Queen Victoria , who was a frequent visitor .</code> | <code>A set of 31 guns , cast 1729-1749 by the first master founder at the Royal Foundry , later the Royal Arsenal , Woolwich , were used to fire salutes until 1907 , often for Queen Victoria who was a frequent visitor .</code> |
+   | <code>In 1929 , the building became vacant , and was given to Prince Edward , Prince of Wales , by his father , King George V . This became the Prince 's chief residence and was used extensively by him for entertaining and as a country retreat .</code> | <code>In 1929 , the building became vacant , and was given to Prince Edward , the Prince of Wales by his father , King George V . This became the Prince 's chief residence , and was used extensively by the Prince for entertaining and as a country retreat .</code> |
+   | <code>Additions included an octagon room in the north-east side , in which the King regularly had dinner .</code> | <code>Additions included an octagon room in the North-East side , where the King regularly had dinner .</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>coco_captions</summary>
+
+ #### coco_captions
+
+ * Dataset: [coco_captions](https://huggingface.co/datasets/sentence-transformers/coco-captions) at [bd26018](https://huggingface.co/datasets/sentence-transformers/coco-captions/tree/bd2601822b9af9a41656d678ffbd5c80d81e276a)
+ * Size: 414,010 training samples
+ * Columns: <code>caption1</code> and <code>caption2</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | caption1 | caption2 |
+   |:--------|:---------|:---------|
+   | type | string | string |
+   | details | <ul><li>min: 10 tokens</li><li>mean: 13.8 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 13.8 tokens</li><li>max: 27 tokens</li></ul> |
+ * Samples:
+   | caption1 | caption2 |
+   |:---------|:---------|
+   | <code>A clock that blends in with the wall hangs in a bathroom. </code> | <code>A very clean and well decorated empty bathroom</code> |
+   | <code>A very clean and well decorated empty bathroom</code> | <code>A bathroom with a border of butterflies and blue paint on the walls above it.</code> |
+   | <code>A bathroom with a border of butterflies and blue paint on the walls above it.</code> | <code>An angled view of a beautifully decorated bathroom.</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>flickr30k_captions</summary>
+
+ #### flickr30k_captions
+
+ * Dataset: [flickr30k_captions](https://huggingface.co/datasets/sentence-transformers/flickr30k-captions) at [0ef0ce3](https://huggingface.co/datasets/sentence-transformers/flickr30k-captions/tree/0ef0ce31492fd8dc161ed483a40d3c4894f9a8c1)
+ * Size: 158,881 training samples
+ * Columns: <code>caption1</code> and <code>caption2</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | caption1 | caption2 |
+   |:--------|:---------|:---------|
+   | type | string | string |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 16.41 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 16.41 tokens</li><li>max: 64 tokens</li></ul> |
+ * Samples:
+   | caption1 | caption2 |
+   |:---------|:---------|
+   | <code>Two men in green shirts are standing in a yard.</code> | <code>Two young, White males are outside near many bushes.</code> |
+   | <code>Two young, White males are outside near many bushes.</code> | <code>Two young guys with shaggy hair look at their hands while hanging out in the yard.</code> |
+   | <code>Two young guys with shaggy hair look at their hands while hanging out in the yard.</code> | <code>A man in a blue shirt standing in a garden.</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>yahoo_answers</summary>
+
+ #### yahoo_answers
+
+ * Dataset: [yahoo_answers](https://huggingface.co/datasets/sentence-transformers/yahoo-answers) at [93b3605](https://huggingface.co/datasets/sentence-transformers/yahoo-answers/tree/93b3605c508cf93e3666c9d3e34640b5fe62b507)
+ * Size: 599,417 training samples
+ * Columns: <code>question</code> and <code>answer</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | question | answer |
+   |:--------|:---------|:-------|
+   | type | string | string |
+   | details | <ul><li>min: 12 tokens</li><li>mean: 57.04 tokens</li><li>max: 309 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 115.16 tokens</li><li>max: 992 tokens</li></ul> |
+ * Samples:
+   | question | answer |
+   |:---------|:-------|
+   | <code>why doesn't an optical mouse work on a glass table? or even on some surfaces?</code> | <code>why doesn't an optical mouse work on a glass table? Optical mice use an LED and a camera to rapidly capture images of the surface beneath the mouse. The infomation from the camera is analyzed by a DSP (Digital Signal Processor) and used to detect imperfections in the underlying surface and determine motion. Some materials, such as glass, mirrors or other very shiny, uniform surfaces interfere with the ability of the DSP to accurately analyze the surface beneath the mouse. \nSince glass is transparent and very uniform, the mouse is unable to pick up enough imperfections in the underlying surface to determine motion. Mirrored surfaces are also a problem, since they constantly reflect back the same image, causing the DSP not to recognize motion properly. When the system is unable to see surface changes associated with movement, the mouse will not work properly.</code> |
+   | <code>What is the best off-road motorcycle trail ? long-distance trail throughout CA</code> | <code>What is the best off-road motorcycle trail ? i hear that the mojave road is amazing!<br />\nsearch for it online.</code> |
+   | <code>What is Trans Fat? How to reduce that? I heard that tras fat is bad for the body. Why is that? Where can we find it in our daily food?</code> | <code>What is Trans Fat? How to reduce that? Trans fats occur in manufactured foods during the process of partial hydrogenation, when hydrogen gas is bubbled through vegetable oil to increase shelf life and stabilize the original polyunsatured oil. The resulting fat is similar to saturated fat, which raises "bad" LDL cholesterol and can lead to clogged arteries and heart disease. \nUntil very recently, food labels were not required to list trans fats, and this health risk remained hidden to consumers. In early July, FDA regulations changed, and food labels will soon begin identifying trans fat content in processed foods.</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+ <details><summary>stack_exchange</summary>
+
+ #### stack_exchange
+
+ * Dataset: [stack_exchange](https://huggingface.co/datasets/sentence-transformers/stackexchange-duplicates) at [1c9657a](https://huggingface.co/datasets/sentence-transformers/stackexchange-duplicates/tree/1c9657aec12d9e101667bb9593efcc623c4a68ff)
+ * Size: 304,525 training samples
+ * Columns: <code>title1</code> and <code>title2</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | title1 | title2 |
+   |:--------|:-------|:-------|
+   | type | string | string |
+   | details | <ul><li>min: 4 tokens</li><li>mean: 14.71 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 15.48 tokens</li><li>max: 71 tokens</li></ul> |
+ * Samples:
+   | title1 | title2 |
+   |:-------|:-------|
+   | <code>what is the advantage of using the GPU rendering options in Android?</code> | <code>Can anyone explain all these Developer Options?</code> |
+   | <code>Blank video when converting uncompressed AVI files with ffmpeg</code> | <code>FFmpeg lossy compression problems</code> |
+   | <code>URL Rewriting of a query string in php</code> | <code>How to create friendly URL in php?</code> |
+ * Loss: [<code>CachedMultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "mini_batch_size": 64
+   }
+   ```
+ </details>
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 128
+ - `learning_rate`: 0.0005
+ - `weight_decay`: 0.01
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.05
+ - `bf16`: True
+ - `bf16_full_eval`: True
+ - `load_best_model_at_end`: True
+
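+ These map one-to-one onto `SentenceTransformerTrainingArguments`; a minimal sketch (the output directory is a placeholder, and defaults are assumed for everything not listed):
+
+ ```python
+ from sentence_transformers import SentenceTransformerTrainingArguments
+
+ # Hedged reconstruction of the non-default training arguments listed above.
+ args = SentenceTransformerTrainingArguments(
+     output_dir="output",  # placeholder
+     eval_strategy="steps",
+     per_device_train_batch_size=128,
+     learning_rate=5e-4,
+     weight_decay=0.01,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.05,
+     bf16=True,
+     bf16_full_eval=True,
+     load_best_model_at_end=True,
+ )
+ ```
+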
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 0.0005
+ - `weight_decay`: 0.01
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 3
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.05
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: True
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+
+ | Epoch | Step | Training Loss | all-nli-dev_cosine_accuracy | sts-dev_spearman_cosine |
+ |:----------:|:---------:|:-------------:|:---------------------------:|:-----------------------:|
+ | 0.0243 | 500 | 2.0912 | - | - |
+ | 0.0485 | 1000 | 1.4267 | - | - |
+ | 0.0728 | 1500 | 1.2426 | - | - |
+ | 0.0970 | 2000 | 1.0654 | 0.8136 | 0.7436 |
+ | 0.1213 | 2500 | 0.8238 | - | - |
+ | 0.1456 | 3000 | 0.8801 | - | - |
+ | 0.1698 | 3500 | 0.7807 | - | - |
+ | 0.1941 | 4000 | 0.7651 | 0.8284 | 0.7611 |
+ | 0.2183 | 4500 | 0.6838 | - | - |
+ | 0.2426 | 5000 | 0.6796 | - | - |
+ | 0.2668 | 5500 | 0.6014 | - | - |
+ | 0.2911 | 6000 | 0.5967 | 0.8360 | 0.7741 |
+ | 0.3154 | 6500 | 0.6318 | - | - |
+ | 0.3396 | 7000 | 0.5821 | - | - |
+ | 0.3639 | 7500 | 0.5258 | - | - |
+ | 0.3881 | 8000 | 0.6353 | 0.8463 | 0.7951 |
+ | 0.4124 | 8500 | 0.5788 | - | - |
+ | 0.4367 | 9000 | 0.5956 | - | - |
+ | 0.4609 | 9500 | 0.5453 | - | - |
+ | 0.4852 | 10000 | 0.5218 | 0.8522 | 0.7960 |
+ | 0.5094 | 10500 | 0.4546 | - | - |
+ | 0.5337 | 11000 | 0.5363 | - | - |
+ | 0.5580 | 11500 | 0.5055 | - | - |
+ | 0.5822 | 12000 | 0.5157 | 0.8574 | 0.8133 |
+ | 0.6065 | 12500 | 0.4474 | - | - |
+ | 0.6307 | 13000 | 0.5242 | - | - |
+ | 0.6550 | 13500 | 0.4406 | - | - |
+ | 0.6792 | 14000 | 0.4766 | 0.8628 | 0.8055 |
+ | 0.7035 | 14500 | 0.5492 | - | - |
+ | 0.7278 | 15000 | 0.4667 | - | - |
+ | 0.7520 | 15500 | 0.401 | - | - |
+ | 0.7763 | 16000 | 0.4805 | 0.8662 | 0.8041 |
+ | 0.8005 | 16500 | 0.4524 | - | - |
+ | 0.8248 | 17000 | 0.5427 | - | - |
+ | 0.8491 | 17500 | 0.44 | - | - |
+ | 0.8733 | 18000 | 0.4774 | 0.8691 | 0.8126 |
+ | 0.8976 | 18500 | 0.3869 | - | - |
+ | 0.9218 | 19000 | 0.4031 | - | - |
+ | 0.9461 | 19500 | 0.409 | - | - |
+ | 0.9704 | 20000 | 0.3779 | 0.8706 | 0.8220 |
+ | 0.9946 | 20500 | 0.3703 | - | - |
+ | 1.0189 | 21000 | 0.3279 | - | - |
+ | 1.0431 | 21500 | 0.2885 | - | - |
+ | 1.0674 | 22000 | 0.2838 | 0.8786 | 0.8185 |
+ | 1.0917 | 22500 | 0.3564 | - | - |
+ | 1.1159 | 23000 | 0.2787 | - | - |
+ | 1.1402 | 23500 | 0.3007 | - | - |
+ | 1.1644 | 24000 | 0.3477 | 0.8759 | 0.8215 |
+ | 1.1887 | 24500 | 0.3176 | - | - |
+ | 1.2129 | 25000 | 0.2671 | - | - |
+ | 1.2372 | 25500 | 0.3309 | - | - |
+ | 1.2615 | 26000 | 0.3487 | 0.8744 | 0.8201 |
+ | 1.2857 | 26500 | 0.3497 | - | - |
+ | 1.3100 | 27000 | 0.2859 | - | - |
+ | 1.3342 | 27500 | 0.3018 | - | - |
+ | 1.3585 | 28000 | 0.2812 | 0.8767 | 0.8229 |
+ | 1.3828 | 28500 | 0.3071 | - | - |
+ | 1.4070 | 29000 | 0.2609 | - | - |
+ | 1.4313 | 29500 | 0.3083 | - | - |
+ | 1.4555 | 30000 | 0.3113 | 0.8782 | 0.8253 |
+ | 1.4798 | 30500 | 0.279 | - | - |
+ | 1.5041 | 31000 | 0.3082 | - | - |
+ | 1.5283 | 31500 | 0.2824 | - | - |
+ | 1.5526 | 32000 | 0.2987 | 0.8786 | 0.8256 |
+ | 1.5768 | 32500 | 0.3417 | - | - |
+ | 1.6011 | 33000 | 0.3075 | - | - |
+ | 1.6253 | 33500 | 0.2631 | - | - |
+ | 1.6496 | 34000 | 0.2642 | 0.8773 | 0.8249 |
+ | 1.6739 | 34500 | 0.2804 | - | - |
+ | 1.6981 | 35000 | 0.244 | - | - |
+ | 1.7224 | 35500 | 0.29 | - | - |
+ | 1.7466 | 36000 | 0.251 | 0.8785 | 0.8262 |
+ | 1.7709 | 36500 | 0.2476 | - | - |
+ | 1.7952 | 37000 | 0.2807 | - | - |
+ | 1.8194 | 37500 | 0.2558 | - | - |
+ | 1.8437 | 38000 | 0.2536 | 0.8777 | 0.8285 |
+ | 1.8679 | 38500 | 0.2779 | - | - |
+ | 1.8922 | 39000 | 0.2567 | - | - |
+ | 1.9165 | 39500 | 0.3665 | - | - |
+ | **1.9407** | **40000** | **0.27** | **0.8796** | **0.8299** |
+ | 1.9650 | 40500 | 0.2635 | - | - |
+ | 1.9892 | 41000 | 0.2477 | - | - |
+ | 2.0135 | 41500 | 0.2386 | - | - |
+ | 2.0377 | 42000 | 0.2477 | 0.8783 | 0.8284 |
+ | 2.0620 | 42500 | 0.2396 | - | - |
+ | 2.0863 | 43000 | 0.1781 | - | - |
+ | 2.1105 | 43500 | 0.1858 | - | - |
+ | 2.1348 | 44000 | 0.1812 | 0.8791 | 0.8278 |
+ | 2.1590 | 44500 | 0.2185 | - | - |
+ | 2.1833 | 45000 | 0.2431 | - | - |
+ | 2.2076 | 45500 | 0.1812 | - | - |
+ | 2.2318 | 46000 | 0.2301 | 0.8806 | 0.8282 |
+ | 2.2561 | 46500 | 0.2169 | - | - |
+ | 2.2803 | 47000 | 0.2074 | - | - |
+ | 2.3046 | 47500 | 0.2229 | - | - |
+ | 2.3289 | 48000 | 0.2257 | 0.8803 | 0.8276 |
+ | 2.3531 | 48500 | 0.1867 | - | - |
+ | 2.3774 | 49000 | 0.2276 | - | - |
+ | 2.4016 | 49500 | 0.214 | - | - |
+ | 2.4259 | 50000 | 0.2085 | 0.8808 | 0.8276 |
+ | 2.4501 | 50500 | 0.2198 | - | - |
+ | 2.4744 | 51000 | 0.231 | - | - |
+ | 2.4987 | 51500 | 0.2395 | - | - |
+ | 2.5229 | 52000 | 0.2204 | 0.8808 | 0.8276 |
+ | 2.5472 | 52500 | 0.1864 | - | - |
+ | 2.5714 | 53000 | 0.3129 | - | - |
+ | 2.5957 | 53500 | 0.2224 | - | - |
+ | 2.6200 | 54000 | 0.1839 | 0.8808 | 0.8276 |
+ | 2.6442 | 54500 | 0.2032 | - | - |
+ | 2.6685 | 55000 | 0.246 | - | - |
+ | 2.6927 | 55500 | 0.199 | - | - |
+ | 2.7170 | 56000 | 0.2089 | 0.8808 | 0.8276 |
+ | 2.7413 | 56500 | 0.2235 | - | - |
+ | 2.7655 | 57000 | 0.2168 | - | - |
+ | 2.7898 | 57500 | 0.2063 | - | - |
+ | 2.8140 | 58000 | 0.2202 | 0.8808 | 0.8276 |
+ | 2.8383 | 58500 | 0.2077 | - | - |
+ | 2.8625 | 59000 | 0.1876 | - | - |
+ | 2.8868 | 59500 | 0.2204 | - | - |
+ | 2.9111 | 60000 | 0.2248 | 0.8808 | 0.8276 |
+ | 2.9353 | 60500 | 0.1974 | - | - |
+ | 2.9596 | 61000 | 0.2084 | - | - |
+ | 2.9838 | 61500 | 0.2312 | - | - |
+
+ * The bold row denotes the saved checkpoint.
+ </details>
+
+ ### Framework Versions
+ - Python: 3.11.13
+ - Sentence Transformers: 5.0.0
+ - Transformers: 4.53.1
+ - PyTorch: 2.7.1+cu128
+ - Accelerate: 1.8.1
+ - Datasets: 4.0.0
+ - Tokenizers: 0.21.2
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### CachedMultipleNegativesRankingLoss
+ ```bibtex
+ @misc{gao2021scaling,
+     title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
+     author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
+     year={2021},
+     eprint={2101.06983},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### CoSENTLoss
+ ```bibtex
+ @online{kexuefm-8847,
+     title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
+     author={Su Jianlin},
+     year={2022},
+     month={Jan},
+     url={https://kexue.fm/archives/8847},
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,42 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ {
+ "architectures": [
+ "ModernBertModel"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "bos_token_id": 50281,
+ "classifier_activation": "gelu",
+ "classifier_bias": false,
+ "classifier_dropout": 0.0,
+ "classifier_pooling": "cls",
+ "cls_token_id": 50281,
+ "decoder_bias": true,
+ "deterministic_flash_attn": false,
+ "embedding_dropout": 0.0,
+ "eos_token_id": 50282,
+ "global_attn_every_n_layers": 3,
+ "global_rope_theta": 160000.0,
+ "hidden_activation": "gelu",
+ "hidden_size": 384,
+ "initializer_cutoff_factor": 2.0,
+ "initializer_range": 0.02,
+ "intermediate_size": 1536,
+ "local_attention": 128,
+ "local_rope_theta": 10000.0,
+ "max_position_embeddings": 1024,
+ "mlp_bias": false,
+ "mlp_dropout": 0.0,
+ "model_type": "modernbert",
+ "norm_bias": false,
+ "norm_eps": 1e-05,
+ "num_attention_heads": 6,
+ "num_hidden_layers": 12,
+ "pad_token_id": 50283,
+ "repad_logits_with_grad": false,
+ "sep_token_id": 50282,
+ "sparse_pred_ignore_index": -100,
+ "sparse_prediction": false,
+ "torch_dtype": "bfloat16",
+ "transformers_version": "4.53.1",
+ "vocab_size": 50368
+ }
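This config describes a compact ModernBERT encoder: 12 layers, hidden size 384 (matching the 384-dimensional embeddings), 6 attention heads, and a global-attention layer every third layer with 128-token local-attention windows elsewhere. A minimal sketch for inspecting these values with Transformers, where `"path/to/this-model"` stands in for the actual repository id:

```python
from transformers import AutoConfig

# Placeholder repository id; substitute the actual model path.
config = AutoConfig.from_pretrained("path/to/this-model")
assert config.model_type == "modernbert"
print(config.hidden_size)                 # 384, the embedding dimension
print(config.num_hidden_layers)           # 12
print(config.global_attn_every_n_layers)  # 3: every third layer uses global attention
print(config.local_attention)             # 128-token local window elsewhere
```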
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "model_type": "SentenceTransformer",
+ "__version__": {
+ "sentence_transformers": "5.0.0",
+ "transformers": "4.53.1",
+ "pytorch": "2.7.1+cu128"
+ },
+ "prompts": {
+ "query": "",
+ "document": ""
+ },
+ "default_prompt_name": null,
+ "similarity_fn_name": "cosine"
+ }
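This file registers two named prompts, `query` and `document` (both empty strings, so neither prepends anything), and selects cosine as the similarity function. A minimal sketch of how those settings surface in the sentence-transformers API (`"path/to/this-model"` is a placeholder):

```python
from sentence_transformers import SentenceTransformer

# Placeholder repository id; substitute the actual model path.
model = SentenceTransformer("path/to/this-model")

# Both registered prompts are empty, so prompt_name adds no prefix here,
# but the call pattern is the same as for models with non-empty prompts.
queries = model.encode(["How do I reset my password?"], prompt_name="query")
docs = model.encode(["Open Settings and choose 'Reset password'."], prompt_name="document")

# similarity_fn_name = "cosine" makes model.similarity compute cosine similarity.
print(model.similarity(queries, docs))
```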
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ec78b1cc85be07aefcdef82bb36553c3d487a7948f0e9f22856542b0d6149df9
+ size 95332048
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ }
+ ]
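`modules.json` wires the two-stage pipeline: a `Transformer` module feeding a mean-pooling `Pooling` module. A minimal sketch of the equivalent pipeline built by hand with the sentence-transformers `models` API, again with `"path/to/this-model"` as a placeholder repository id:

```python
from sentence_transformers import SentenceTransformer, models

# Placeholder repository id; substitute the actual model path.
transformer = models.Transformer("path/to/this-model", max_seq_length=1024)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 384 for this backbone
    pooling_mode="mean",                         # mean-token pooling, per 1_Pooling
)
model = SentenceTransformer(modules=[transformer, pooling])
```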
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 1024,
+ "do_lower_case": false
+ }
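`sentence_bert_config.json` caps tokenized inputs at 1024 tokens, matching the backbone's `max_position_embeddings`, and disables lowercasing. A small sketch (placeholder repository id) for reading and, if desired, lowering that cap:

```python
from sentence_transformers import SentenceTransformer

# Placeholder repository id; substitute the actual model path.
model = SentenceTransformer("path/to/this-model")
print(model.max_seq_length)  # 1024; longer inputs are truncated at encode time

# The cap can be lowered, e.g. for faster encoding of short texts:
model.max_seq_length = 512
```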
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "|||IP_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "1": {
+ "content": "<|padding|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50254": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50255": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50256": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50257": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50258": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50259": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50260": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50261": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50262": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50263": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50264": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50265": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50266": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50267": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50268": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50269": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50270": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50271": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50272": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50273": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50274": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50275": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50276": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50277": {
+ "content": "|||EMAIL_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50278": {
+ "content": "|||PHONE_NUMBER|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50279": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50280": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50281": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50282": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50283": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50284": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50285": {
+ "content": "[unused0]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50286": {
+ "content": "[unused1]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50287": {
+ "content": "[unused2]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50288": {
+ "content": "[unused3]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50289": {
+ "content": "[unused4]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50290": {
+ "content": "[unused5]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50291": {
+ "content": "[unused6]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50292": {
+ "content": "[unused7]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50293": {
+ "content": "[unused8]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50294": {
+ "content": "[unused9]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50295": {
+ "content": "[unused10]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50296": {
+ "content": "[unused11]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50297": {
+ "content": "[unused12]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50298": {
+ "content": "[unused13]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50299": {
+ "content": "[unused14]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50300": {
+ "content": "[unused15]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50301": {
+ "content": "[unused16]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50302": {
+ "content": "[unused17]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50303": {
+ "content": "[unused18]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50304": {
+ "content": "[unused19]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50305": {
+ "content": "[unused20]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50306": {
+ "content": "[unused21]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50307": {
+ "content": "[unused22]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50308": {
+ "content": "[unused23]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50309": {
+ "content": "[unused24]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50310": {
+ "content": "[unused25]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50311": {
+ "content": "[unused26]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50312": {
+ "content": "[unused27]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50313": {
+ "content": "[unused28]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50314": {
+ "content": "[unused29]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50315": {
+ "content": "[unused30]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50316": {
+ "content": "[unused31]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50317": {
+ "content": "[unused32]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50318": {
+ "content": "[unused33]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50319": {
+ "content": "[unused34]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50320": {
+ "content": "[unused35]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50321": {
+ "content": "[unused36]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50322": {
+ "content": "[unused37]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50323": {
+ "content": "[unused38]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50324": {
+ "content": "[unused39]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50325": {
+ "content": "[unused40]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50326": {
+ "content": "[unused41]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50327": {
+ "content": "[unused42]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50328": {
+ "content": "[unused43]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50329": {
+ "content": "[unused44]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50330": {
+ "content": "[unused45]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50331": {
+ "content": "[unused46]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50332": {
+ "content": "[unused47]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50333": {
+ "content": "[unused48]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50334": {
+ "content": "[unused49]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50335": {
+ "content": "[unused50]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50336": {
+ "content": "[unused51]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50337": {
+ "content": "[unused52]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50338": {
+ "content": "[unused53]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50339": {
+ "content": "[unused54]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50340": {
+ "content": "[unused55]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50341": {
+ "content": "[unused56]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50342": {
+ "content": "[unused57]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "[unused58]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "[unused59]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "[unused60]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "[unused61]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "[unused62]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "[unused63]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "[unused64]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "[unused65]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "[unused66]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "[unused67]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "[unused68]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "[unused69]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "[unused70]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "[unused71]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "[unused72]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "[unused73]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "[unused74]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "[unused75]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "[unused76]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "[unused77]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "[unused78]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "[unused79]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "[unused80]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "[unused81]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "[unused82]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_input_names": [
+ "input_ids",
+ "attention_mask"
+ ],
+ "model_max_length": 8192,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": "[UNK]"
+ }
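The special-token ids declared here should line up with the ids in `config.json` (`cls_token_id` 50281, `sep_token_id`/`eos_token_id` 50282, `pad_token_id` 50283). A small verification sketch, again with `"path/to/this-model"` as a placeholder repository id:

```python
from transformers import AutoTokenizer

# Placeholder repository id; substitute the actual model path.
tok = AutoTokenizer.from_pretrained("path/to/this-model")
assert tok.cls_token_id == 50281
assert tok.sep_token_id == 50282
assert tok.pad_token_id == 50283

# Encoded sequences are wrapped in [CLS] ... [SEP]:
print(tok("An example sentence.")["input_ids"])
```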