szoplakz committed on
Commit
75bc9d4
·
verified ·
1 Parent(s): 16755ca

Update README.md

Files changed (1)
  1. README.md +0 -407
README.md CHANGED
@@ -35,410 +35,3 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [k
35
  - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
36
  - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
37
 
38
- ### Full Model Architecture
39
-
40
- ```
41
- SentenceTransformer(
42
- (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: RobertaModel
43
- (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
44
- )
45
- ```
46
-
47
- ## Usage
48
-
49
- ### Direct Usage (Sentence Transformers)
50
-
51
- First install the Sentence Transformers library:
52
-
53
- ```bash
54
- pip install -U sentence-transformers
55
- ```
56
-
57
- Then you can load this model and run inference.
58
- ```python
59
- from sentence_transformers import SentenceTransformer
60
-
61
- # Download from the 🤗 Hub
62
- model = SentenceTransformer("sentence_transformers_model_id")
63
- # Run inference
64
- sentences = [
65
- 'Ustanovenie tohto odseku platí aj v prípade zmeny majiteľa zmenky alebo postúpenia práva zo zmenky.',
66
-
67
- ]
68
- embeddings = model.encode(sentences)
69
- print(embeddings.shape)
70
- # [1, 768]
71
-
72
- # Get the similarity scores for the embeddings
73
- similarities = model.similarity(embeddings, embeddings)
74
- print(similarities.shape)
75
- # [1, 1]
76
- ```
77
-
78
- <!--
79
- ### Direct Usage (Transformers)
80
-
81
- <details><summary>Click to see the direct usage in Transformers</summary>
82
-
83
- </details>
84
- -->
85
-
86
- <!--
87
- ### Downstream Usage (Sentence Transformers)
88
-
89
- You can finetune this model on your own dataset.
90
-
91
- <details><summary>Click to expand</summary>
92
-
93
- </details>
94
- -->
95
-
96
- <!--
97
- ### Out-of-Scope Use
98
-
99
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
100
- -->
101
-
102
- <!--
103
- ## Bias, Risks and Limitations
104
-
105
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
106
- -->
107
-
108
- <!--
109
- ### Recommendations
110
-
111
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
112
- -->
113
-
114
- ## Training Details
115
-
116
- ### Training Dataset
117
-
118
- #### Unnamed Dataset
119
-
120
- * Size: 500,000 training samples
121
- * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
122
- * Approximate statistics based on the first 1000 samples:
123
- | | sentence_0 | sentence_1 | label |
124
- |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------|
125
- | type | string | string | float |
126
- | details | <ul><li>min: 4 tokens</li><li>mean: 31.75 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 31.75 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
127
- * Samples:
128
- | sentence_0 | sentence_1 | label |
129
- |:--------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
130
- | <code>Súd: Okresný súd Námestovo<br>Spisová značka: 5C/265/2015<br>Identifikačné číslo súdneho spisu: 5815205480<br>Dátum vydania rozhodnutia: 02.</code> | <code>Súd: Okresný súd Námestovo<br>Spisová značka: 5C/265/2015<br>Identifikačné číslo súdneho spisu: 5815205480<br>Dátum vydania rozhodnutia: 02.</code> | <code>1.0</code> |
131
- | <code>06.</code> | <code>06.</code> | <code>1.0</code> |
132
- | <code>2016<br>Meno a priezvisko sudcu, VSÚ: JUDr.</code> | <code>2016<br>Meno a priezvisko sudcu, VSÚ: JUDr.</code> | <code>1.0</code> |
133
- * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
134
- ```json
135
- {
136
- "loss_fct": "torch.nn.modules.loss.MSELoss"
137
- }
138
- ```
139
-
140
- ### Training Hyperparameters
141
- #### Non-Default Hyperparameters
142
-
143
- - `per_device_train_batch_size`: 4
144
- - `per_device_eval_batch_size`: 4
145
- - `num_train_epochs`: 1
146
- - `fp16`: True
147
- - `multi_dataset_batch_sampler`: round_robin
148
-
149
- #### All Hyperparameters
150
- <details><summary>Click to expand</summary>
151
-
152
- - `overwrite_output_dir`: False
153
- - `do_predict`: False
154
- - `eval_strategy`: no
155
- - `prediction_loss_only`: True
156
- - `per_device_train_batch_size`: 4
157
- - `per_device_eval_batch_size`: 4
158
- - `per_gpu_train_batch_size`: None
159
- - `per_gpu_eval_batch_size`: None
160
- - `gradient_accumulation_steps`: 1
161
- - `eval_accumulation_steps`: None
162
- - `torch_empty_cache_steps`: None
163
- - `learning_rate`: 5e-05
164
- - `weight_decay`: 0.0
165
- - `adam_beta1`: 0.9
166
- - `adam_beta2`: 0.999
167
- - `adam_epsilon`: 1e-08
168
- - `max_grad_norm`: 1
169
- - `num_train_epochs`: 1
170
- - `max_steps`: -1
171
- - `lr_scheduler_type`: linear
172
- - `lr_scheduler_kwargs`: {}
173
- - `warmup_ratio`: 0.0
174
- - `warmup_steps`: 0
175
- - `log_level`: passive
176
- - `log_level_replica`: warning
177
- - `log_on_each_node`: True
178
- - `logging_nan_inf_filter`: True
179
- - `save_safetensors`: True
180
- - `save_on_each_node`: False
181
- - `save_only_model`: False
182
- - `restore_callback_states_from_checkpoint`: False
183
- - `no_cuda`: False
184
- - `use_cpu`: False
185
- - `use_mps_device`: False
186
- - `seed`: 42
187
- - `data_seed`: None
188
- - `jit_mode_eval`: False
189
- - `use_ipex`: False
190
- - `bf16`: False
191
- - `fp16`: True
192
- - `fp16_opt_level`: O1
193
- - `half_precision_backend`: auto
194
- - `bf16_full_eval`: False
195
- - `fp16_full_eval`: False
196
- - `tf32`: None
197
- - `local_rank`: 0
198
- - `ddp_backend`: None
199
- - `tpu_num_cores`: None
200
- - `tpu_metrics_debug`: False
201
- - `debug`: []
202
- - `dataloader_drop_last`: False
203
- - `dataloader_num_workers`: 0
204
- - `dataloader_prefetch_factor`: None
205
- - `past_index`: -1
206
- - `disable_tqdm`: False
207
- - `remove_unused_columns`: True
208
- - `label_names`: None
209
- - `load_best_model_at_end`: False
210
- - `ignore_data_skip`: False
211
- - `fsdp`: []
212
- - `fsdp_min_num_params`: 0
213
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
214
- - `tp_size`: 0
215
- - `fsdp_transformer_layer_cls_to_wrap`: None
216
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
217
- - `deepspeed`: None
218
- - `label_smoothing_factor`: 0.0
219
- - `optim`: adamw_torch
220
- - `optim_args`: None
221
- - `adafactor`: False
222
- - `group_by_length`: False
223
- - `length_column_name`: length
224
- - `ddp_find_unused_parameters`: None
225
- - `ddp_bucket_cap_mb`: None
226
- - `ddp_broadcast_buffers`: False
227
- - `dataloader_pin_memory`: True
228
- - `dataloader_persistent_workers`: False
229
- - `skip_memory_metrics`: True
230
- - `use_legacy_prediction_loop`: False
231
- - `push_to_hub`: False
232
- - `resume_from_checkpoint`: None
233
- - `hub_model_id`: None
234
- - `hub_strategy`: every_save
235
- - `hub_private_repo`: None
236
- - `hub_always_push`: False
237
- - `gradient_checkpointing`: False
238
- - `gradient_checkpointing_kwargs`: None
239
- - `include_inputs_for_metrics`: False
240
- - `include_for_metrics`: []
241
- - `eval_do_concat_batches`: True
242
- - `fp16_backend`: auto
243
- - `push_to_hub_model_id`: None
244
- - `push_to_hub_organization`: None
245
- - `mp_parameters`:
246
- - `auto_find_batch_size`: False
247
- - `full_determinism`: False
248
- - `torchdynamo`: None
249
- - `ray_scope`: last
250
- - `ddp_timeout`: 1800
251
- - `torch_compile`: False
252
- - `torch_compile_backend`: None
253
- - `torch_compile_mode`: None
254
- - `include_tokens_per_second`: False
255
- - `include_num_input_tokens_seen`: False
256
- - `neftune_noise_alpha`: None
257
- - `optim_target_modules`: None
258
- - `batch_eval_metrics`: False
259
- - `eval_on_start`: False
260
- - `use_liger_kernel`: False
261
- - `eval_use_gather_object`: False
262
- - `average_tokens_across_devices`: False
263
- - `prompts`: None
264
- - `batch_sampler`: batch_sampler
265
- - `multi_dataset_batch_sampler`: round_robin
266
-
267
- </details>
268
-
269
- ### Training Logs
270
- <details><summary>Click to expand</summary>
271
-
272
- | Epoch | Step | Training Loss |
273
- |:-----:|:-----:|:-------------:|
274
- | 0.008 | 500 | 0.0089 |
275
- | 0.016 | 1000 | 0.0001 |
276
- | 0.024 | 1500 | 0.0 |
277
- | 0.032 | 2000 | 0.0 |
278
- | 0.04 | 2500 | 0.0 |
279
- | 0.048 | 3000 | 0.0 |
280
- | 0.056 | 3500 | 0.0 |
281
- | 0.064 | 4000 | 0.0 |
282
- | 0.072 | 4500 | 0.0 |
283
- | 0.08 | 5000 | 0.0 |
284
- | 0.088 | 5500 | 0.0 |
285
- | 0.096 | 6000 | 0.0 |
286
- | 0.104 | 6500 | 0.0 |
287
- | 0.112 | 7000 | 0.0 |
288
- | 0.12 | 7500 | 0.0 |
289
- | 0.128 | 8000 | 0.0 |
290
- | 0.136 | 8500 | 0.0 |
291
- | 0.144 | 9000 | 0.0 |
292
- | 0.152 | 9500 | 0.0 |
293
- | 0.16 | 10000 | 0.0 |
294
- | 0.168 | 10500 | 0.0 |
295
- | 0.176 | 11000 | 0.0 |
296
- | 0.184 | 11500 | 0.0 |
297
- | 0.192 | 12000 | 0.0 |
298
- | 0.2 | 12500 | 0.0 |
299
- | 0.208 | 13000 | 0.0 |
300
- | 0.216 | 13500 | 0.0 |
301
- | 0.224 | 14000 | 0.0 |
302
- | 0.232 | 14500 | 0.0 |
303
- | 0.24 | 15000 | 0.0 |
304
- | 0.248 | 15500 | 0.0 |
305
- | 0.256 | 16000 | 0.0 |
306
- | 0.264 | 16500 | 0.0 |
307
- | 0.272 | 17000 | 0.0 |
308
- | 0.28 | 17500 | 0.0 |
309
- | 0.288 | 18000 | 0.0 |
310
- | 0.296 | 18500 | 0.0 |
311
- | 0.304 | 19000 | 0.0 |
312
- | 0.312 | 19500 | 0.0 |
313
- | 0.32 | 20000 | 0.0 |
314
- | 0.328 | 20500 | 0.0 |
315
- | 0.336 | 21000 | 0.0 |
316
- | 0.344 | 21500 | 0.0 |
317
- | 0.352 | 22000 | 0.0 |
318
- | 0.36 | 22500 | 0.0 |
319
- | 0.368 | 23000 | 0.0 |
320
- | 0.376 | 23500 | 0.0 |
321
- | 0.384 | 24000 | 0.0 |
322
- | 0.392 | 24500 | 0.0 |
323
- | 0.4 | 25000 | 0.0 |
324
- | 0.408 | 25500 | 0.0 |
325
- | 0.416 | 26000 | 0.0 |
326
- | 0.424 | 26500 | 0.0 |
327
- | 0.432 | 27000 | 0.0 |
328
- | 0.44 | 27500 | 0.0 |
329
- | 0.448 | 28000 | 0.0 |
330
- | 0.456 | 28500 | 0.0 |
331
- | 0.464 | 29000 | 0.0 |
332
- | 0.472 | 29500 | 0.0 |
333
- | 0.48 | 30000 | 0.0 |
334
- | 0.488 | 30500 | 0.0 |
335
- | 0.496 | 31000 | 0.0 |
336
- | 0.504 | 31500 | 0.0 |
337
- | 0.512 | 32000 | 0.0 |
338
- | 0.52 | 32500 | 0.0 |
339
- | 0.528 | 33000 | 0.0 |
340
- | 0.536 | 33500 | 0.0 |
341
- | 0.544 | 34000 | 0.0 |
342
- | 0.552 | 34500 | 0.0 |
343
- | 0.56 | 35000 | 0.0 |
344
- | 0.568 | 35500 | 0.0 |
345
- | 0.576 | 36000 | 0.0 |
346
- | 0.584 | 36500 | 0.0 |
347
- | 0.592 | 37000 | 0.0 |
348
- | 0.6 | 37500 | 0.0 |
349
- | 0.608 | 38000 | 0.0 |
350
- | 0.616 | 38500 | 0.0 |
351
- | 0.624 | 39000 | 0.0 |
352
- | 0.632 | 39500 | 0.0 |
353
- | 0.64 | 40000 | 0.0 |
354
- | 0.648 | 40500 | 0.0 |
355
- | 0.656 | 41000 | 0.0 |
356
- | 0.664 | 41500 | 0.0 |
357
- | 0.672 | 42000 | 0.0 |
358
- | 0.68 | 42500 | 0.0 |
359
- | 0.688 | 43000 | 0.0 |
360
- | 0.696 | 43500 | 0.0 |
361
- | 0.704 | 44000 | 0.0 |
362
- | 0.712 | 44500 | 0.0 |
363
- | 0.72 | 45000 | 0.0 |
364
- | 0.728 | 45500 | 0.0 |
365
- | 0.736 | 46000 | 0.0 |
366
- | 0.744 | 46500 | 0.0 |
367
- | 0.752 | 47000 | 0.0 |
368
- | 0.76 | 47500 | 0.0 |
369
- | 0.768 | 48000 | 0.0 |
370
- | 0.776 | 48500 | 0.0 |
371
- | 0.784 | 49000 | 0.0 |
372
- | 0.792 | 49500 | 0.0 |
373
- | 0.8 | 50000 | 0.0 |
374
- | 0.808 | 50500 | 0.0 |
375
- | 0.816 | 51000 | 0.0 |
376
- | 0.824 | 51500 | 0.0 |
377
- | 0.832 | 52000 | 0.0 |
378
- | 0.84 | 52500 | 0.0 |
379
- | 0.848 | 53000 | 0.0 |
380
- | 0.856 | 53500 | 0.0 |
381
- | 0.864 | 54000 | 0.0 |
382
- | 0.872 | 54500 | 0.0 |
383
- | 0.88 | 55000 | 0.0 |
384
- | 0.888 | 55500 | 0.0 |
385
- | 0.896 | 56000 | 0.0 |
386
- | 0.904 | 56500 | 0.0 |
387
- | 0.912 | 57000 | 0.0 |
388
- | 0.92 | 57500 | 0.0 |
389
- | 0.928 | 58000 | 0.0 |
390
- | 0.936 | 58500 | 0.0 |
391
- | 0.944 | 59000 | 0.0 |
392
- | 0.952 | 59500 | 0.0 |
393
- | 0.96 | 60000 | 0.0 |
394
- | 0.968 | 60500 | 0.0 |
395
- | 0.976 | 61000 | 0.0 |
396
- | 0.984 | 61500 | 0.0 |
397
- | 0.992 | 62000 | 0.0 |
398
- | 1.0 | 62500 | 0.0 |
399
-
400
- </details>
401
-
402
- ### Framework Versions
403
- - Python: 3.9.13
404
- - Sentence Transformers: 4.1.0
405
- - Transformers: 4.51.3
406
- - PyTorch: 2.6.0+cu124
407
- - Accelerate: 1.6.0
408
- - Datasets: 3.5.0
409
- - Tokenizers: 0.21.1
410
-
411
- ## Citation
412
-
413
- ### BibTeX
414
-
415
- #### Sentence Transformers
416
- ```bibtex
417
- @inproceedings{reimers-2019-sentence-bert,
418
- title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
419
- author = "Reimers, Nils and Gurevych, Iryna",
420
- booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
421
- month = "11",
422
- year = "2019",
423
- publisher = "Association for Computational Linguistics",
424
- url = "https://arxiv.org/abs/1908.10084",
425
- }
426
- ```
427
-
428
- <!--
429
- ## Glossary
430
-
431
- *Clearly define terms in order to be accessible across audiences.*
432
- -->
433
-
434
- <!--
435
- ## Model Card Authors
436
-
437
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
438
- -->
439
-
440
- <!--
441
- ## Model Card Contact
442
-
443
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
444
- -->
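Note on the removed architecture block: it configures `Pooling` with `pooling_mode_mean_tokens: True`, meaning the sentence embedding is the average of the token embeddings. The snippet below is a minimal standard-library sketch of that idea, assuming padding positions are excluded through an attention mask. The function name `mean_pool` and the toy vectors are hypothetical illustrations, not the library's implementation.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the tokens whose mask value is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:  # skip padding positions
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

In the removed model card the pooled vector would have 768 dimensions (`word_embedding_dimension: 768`); the toy example uses 2 for readability.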
 
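Note on the removed training section: it lists `CosineSimilarityLoss` with `MSELoss`, that is, the mean squared error between the cosine similarity of the two sentence embeddings and the label. The sample rows show identical `sentence_0` and `sentence_1` with a label of 1.0 (min, mean, and max are all 1.0), and the training log reports a loss of 0.0 from step 1500 onward. The sketch below, in plain Python with hypothetical helper names and toy vectors, illustrates why identical pairs with label 1.0 give a loss near zero regardless of what the embeddings are. It is an illustration of the formula under those assumptions, not the library's code.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_mse_loss(pairs: List[Tuple[Vector, Vector]], labels: List[float]) -> float:
    """Mean squared error between cosine similarity and the target label."""
    errors = [(cosine_similarity(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)


# Identical embeddings with label 1.0: cosine similarity is 1, so the loss is ~0.
identical = [([0.5, 1.5, -2.0], [0.5, 1.5, -2.0])]
loss_identical = cosine_mse_loss(identical, [1.0])

# Orthogonal embeddings with label 1.0: cosine similarity is 0, so the loss is 1.
orthogonal = [([1.0, 0.0], [0.0, 1.0])]
loss_orthogonal = cosine_mse_loss(orthogonal, [1.0])

print(loss_identical, loss_orthogonal)
```

Because the same encoder produces both embeddings of a pair, identical inputs yield identical vectors, so a dataset of this shape gives the loss almost nothing to minimize.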