---
base_model: aubmindlab/bert-base-arabertv02
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2279719
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: ما هو علاج الفطريات الجلدية؟
    sentences:
      - >-
        كيف سيؤثر ذلك على الطلاب الهنود الذين يدرسون أو يعملون في الولايات
        المتحدة إذا أصبح ترامب رئيساً؟
      - كيف يمكنك معالجة الأكزيما بشكل طبيعي؟
      - كيف تعالج الفطريات الجلدية؟
  - source_sentence: >-
      So Eric had an initial design idea for a robot, but we didn't have all the
      parts figured out, so we did what anybody would do in our situation: we
      asked the Internet for help.
    sentences:
      - >-
        وهكذا أول شيء فعلناه هو , بمجرد أن التسلسل خرج من الماكينات , نشرناه على
        الإنترنت .
      - >-
        وكانت لدى "إريك" فكرة مبدئية لصناعة روبوت، ولكن لم يكن لدينا فكرة عن
        القطع التي نحتاجها لذلك قمنا بما يمكن أن يقوم به أي شخص بوضعنا قمنا بطلب
        المساعدة عبر الإنترنت
      - >-
        ما هي مواقع الويب التي يجب اتباعها لتوصيات الأسهم خلال اليوم في سوق
        الأسهم الهندية؟
  - source_sentence: Well, guess what? In England, it's seven per 100,000.
    sentences:
      - عندما نكون أطفالًا، نتعلم الضحك، ونتعلم الضحك بشكل أساسي في اللعب.
      - هذا ليس 10000 دولارا، إنه بالعملة المحلية .
      - خمنوا ماذا؟ في إنكلترا، النسبة سبع في كل 000 100.
  - source_sentence: ما هي العوامل الحيوية وغير الحيوية؟ كيف تختلف عن بعضها البعض؟
    sentences:
      - ما هي بعض النصائح لتعلم لغة بايثون؟
      - كما تم تسجيل نتائج إيجابية لثلاثة أيام متتالية.
      - كيف تقارن العوامل الحيوية والعوامل غير الحيوية وتتناقض؟
  - source_sentence: >-
      And the piece of art he bought at the yard sale is hanging in his
      classroom; he's a teacher now.
    sentences:
      - هل الرياضيات لغة أخرى؟
      - تدريجيا، أصبحت هذه العصافير بمثابة معلمين له.
      - >-
        أما اللوحات التي أشتراها منّي فهي معلّقة الآن في غرفة الصف خاصّته؛ فقد
        أصبح مدرّساً.
model-index:
  - name: SentenceTransformer based on aubmindlab/bert-base-arabertv02
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts dev 768
          type: sts-dev-768
        metrics:
          - type: pearson_cosine
            value: 0.8410341962006318
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8422963798504417
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.8119358373898954
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8260328397910858
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.8138598024349573
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.831707795171752
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.8371709698109359
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8389681969788781
            name: Spearman Dot
          - type: pearson_max
            value: 0.8410341962006318
            name: Pearson Max
          - type: spearman_max
            value: 0.8422963798504417
            name: Spearman Max
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts dev 512
          type: sts-dev-512
        metrics:
          - type: pearson_cosine
            value: 0.8408199016320912
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8415754271206667
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.8114852653680014
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8231951698466913
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.8125911836775428
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8267107276111355
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.8357223021732401
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8377004761329118
            name: Spearman Dot
          - type: pearson_max
            value: 0.8408199016320912
            name: Pearson Max
          - type: spearman_max
            value: 0.8415754271206667
            name: Spearman Max
---

SentenceTransformer based on aubmindlab/bert-base-arabertv02

This is a sentence-transformers model finetuned from aubmindlab/bert-base-arabertv02. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: aubmindlab/bert-base-arabertv02
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
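
For reference, the stack above is a BertModel followed by attention-masked mean pooling. Below is a minimal sketch of the same computation, assuming only the transformers and torch libraries (not the author's code):

import torch
from transformers import AutoModel, AutoTokenizer

repo = "silma-ai/silma-embeddding-matryoshka-0.1"
tokenizer = AutoTokenizer.from_pretrained(repo)
bert = AutoModel.from_pretrained(repo)

batch = tokenizer(
    ["كيف تعالج الفطريات الجلدية؟"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = bert(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling over real tokens only: mask out padding before averaging.
mask = batch["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embedding.shape)  # torch.Size([1, 768])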

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("silma-ai/silma-embeddding-matryoshka-0.1")
# Run inference
sentences = [
    "And the piece of art he bought at the yard sale is hanging in his classroom; he's a teacher now.",
    'أما اللوحات التي أشتراها منّي فهي معلّقة الآن في غرفة الصف خاصّته؛ فقد أصبح مدرّساً.',
    'تدريجيا، أصبحت هذه العصافير بمثابة معلمين له.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
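
Because the model was trained with MatryoshkaLoss at dimensions 768 and 512 (see Training Details below), the embeddings can also be truncated to their first 512 dimensions with little quality loss. A short sketch using the truncate_dim argument of Sentence Transformers:

from sentence_transformers import SentenceTransformer

# Load the same model, but cut every embedding down to its first 512 dimensions
model_512 = SentenceTransformer("silma-ai/silma-embeddding-matryoshka-0.1", truncate_dim=512)

embeddings = model_512.encode([
    "ما هو علاج الفطريات الجلدية؟",
    "كيف تعالج الفطريات الجلدية؟",
])
print(embeddings.shape)
# (2, 512)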

Evaluation

Metrics

Semantic Similarity (dataset: sts-dev-768)

Metric              Value
pearson_cosine      0.8410
spearman_cosine     0.8423
pearson_manhattan   0.8119
spearman_manhattan  0.8260
pearson_euclidean   0.8139
spearman_euclidean  0.8317
pearson_dot         0.8372
spearman_dot        0.8390
pearson_max         0.8410
spearman_max        0.8423

Semantic Similarity (dataset: sts-dev-512)

Metric              Value
pearson_cosine      0.8408
spearman_cosine     0.8416
pearson_manhattan   0.8115
spearman_manhattan  0.8232
pearson_euclidean   0.8126
spearman_euclidean  0.8267
pearson_dot         0.8357
spearman_dot        0.8377
pearson_max         0.8408
spearman_max        0.8416
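
These figures are of the kind produced by Sentence Transformers' EmbeddingSimilarityEvaluator on a labeled dev set of sentence pairs. A sketch of such an evaluation follows; the pairs and gold scores below are illustrative placeholders, not the actual sts-dev data:

from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("silma-ai/silma-embeddding-matryoshka-0.1")

# Placeholder pairs with gold similarity scores in [0, 1]
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["ثلاثة رجال يلعبون كرة السلة", "الرجل جالس", "كيف أصنع صاروخاً؟"],
    sentences2=["ثلاثة رجال يلعبون لعبة كرة السلة", "رجل في قميص رمادي يقف.", "كيف أصنع صاروخاً صناعياً؟"],
    scores=[0.95, 0.35, 0.90],
    main_similarity=SimilarityFunction.COSINE,
    name="sts-dev-768",
)
print(evaluator(model))  # pearson/spearman for cosine, euclidean, manhattan, dot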

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,279,719 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
      anchor:   string, min 4 / mean 19.51 / max 139 tokens
      positive: string, min 4 / mean 12.47 / max 59 tokens
      negative: string, min 4 / mean 12.13 / max 72 tokens
  • Samples (anchor | positive | negative):
      كيف أصنع صاروخاً؟ | كيف أصنع صاروخاً صناعياً؟ | كيف أصنع أول روبوت لي؟
      فتاة شابة تجلس على طاولة مع وعاء على رأسها | فتاة صغيرة لديها وعاء على رأسها | رجل يأكل الحبوب في سيارته
      كيف يمكنني الانضمام إلى الجيش الهندي بعد البكالوريوس؟ | كيف تنضم للجيش الهندي بعد الهندسة؟ | كيف لي أن أعرف ماذا أريد أن أفعل في حياتي؟
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512
        ],
        "matryoshka_weights": [
            1,
            1
        ],
        "n_dims_per_step": -1
    }
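
As a sketch, the parameters above correspond to the following loss construction in Sentence Transformers (base model taken from the Model Details section; not the exact training script):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("aubmindlab/bert-base-arabertv02")

# MultipleNegativesRankingLoss over (anchor, positive, negative) triplets,
# applied at both 768 and 512 dimensions with equal weight
base_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    base_loss,
    matryoshka_dims=[768, 512],
    matryoshka_weights=[1, 1],
    n_dims_per_step=-1,
)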
    

Evaluation Dataset

Unnamed Dataset

  • Size: 600 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 600 samples:
      anchor:   string, min 4 / mean 19.5 / max 146 tokens
      positive: string, min 4 / mean 12.67 / max 43 tokens
      negative: string, min 4 / mean 12.15 / max 41 tokens
  • Samples (anchor | positive | negative):
      And this explanation represents great progress. | وهذا التفسير يمثل تقدماً عظيماً | وأظهرت هذا الإتجاه المذهل.
      ثلاثة رجال يلعبون كرة السلة | ثلاثة رجال يلعبون لعبة كرة السلة | رجلين يرتديان ملابس غريبة يقفزان على ملعب كرة السلة
      الرجل جالس | رجل يرتدي قميصاً أحمر يعزف الطبول. | رجل في قميص رمادي يقف.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512
        ],
        "matryoshka_weights": [
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 50
  • per_device_eval_batch_size: 10
  • learning_rate: 1e-05
  • bf16: True
  • batch_sampler: no_duplicates
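
A sketch of how these values map onto SentenceTransformerTrainingArguments (output_dir is a placeholder; num_train_epochs: 3 comes from the full list below):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=50,
    per_device_eval_batch_size=10,
    learning_rate=1e-5,
    num_train_epochs=3,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # "no_duplicates" above
)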

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 50
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev-768_spearman_cosine sts-dev-512_spearman_cosine
0.0044 50 - 0.7749 0.7784 0.7748
0.0088 100 - 0.6231 0.7854 0.7809
0.0132 150 - 0.5326 0.8028 0.7992
0.0175 200 - 0.4880 0.8103 0.8047
0.0219 250 1.1802 0.4398 0.8084 0.8043
0.0263 300 - 0.4203 0.8108 0.8058
0.0307 350 - 0.3880 0.8134 0.8075
0.0351 400 - 0.3998 0.8180 0.8145
0.0395 450 - 0.3840 0.8154 0.8114
0.0439 500 0.7483 0.3804 0.8105 0.8056
0.0483 550 - 0.3695 0.8147 0.8103
0.0526 600 - 0.3649 0.8145 0.8101
0.0570 650 - 0.3494 0.8192 0.8157
0.0614 700 - 0.3437 0.8159 0.8106
0.0658 750 0.6561 0.3302 0.8158 0.8104
0.0702 800 - 0.3359 0.8204 0.8174
0.0746 850 - 0.3446 0.8119 0.8094
0.0790 900 - 0.3419 0.8265 0.8252
0.0833 950 - 0.3197 0.8177 0.8141
0.0877 1000 0.6178 0.3250 0.8213 0.8185
0.0921 1050 - 0.3017 0.8161 0.8127
0.0965 1100 - 0.3058 0.8232 0.8180
0.1009 1150 - 0.3066 0.8236 0.8193
0.1053 1200 - 0.2924 0.8275 0.8237
0.1097 1250 0.5633 0.3096 0.8206 0.8173
0.1141 1300 - 0.3009 0.8299 0.8277
0.1184 1350 - 0.3067 0.8158 0.8111
0.1228 1400 - 0.2898 0.8215 0.8180
0.1272 1450 - 0.2810 0.8272 0.8261
0.1316 1500 0.5337 0.2810 0.8228 0.8187
0.1360 1550 - 0.2772 0.8167 0.8139
0.1404 1600 - 0.2772 0.8228 0.8194
0.1448 1650 - 0.2751 0.8193 0.8153
0.1491 1700 - 0.2579 0.8182 0.8147
0.1535 1750 0.5154 0.2542 0.8199 0.8166
0.1579 1800 - 0.2607 0.8243 0.8224
0.1623 1850 - 0.2595 0.8280 0.8254
0.1667 1900 - 0.2612 0.8272 0.8255
0.1711 1950 - 0.2644 0.8273 0.8242
0.1755 2000 0.4838 0.2618 0.8276 0.8246
0.1799 2050 - 0.2553 0.8219 0.8200
0.1842 2100 - 0.2581 0.8232 0.8217
0.1886 2150 - 0.2620 0.8254 0.8232
0.1930 2200 - 0.2627 0.8235 0.8193
0.1974 2250 0.486 0.2597 0.8170 0.8142
0.2018 2300 - 0.2605 0.8261 0.8231
0.2062 2350 - 0.2584 0.8252 0.8222
0.2106 2400 - 0.2663 0.8247 0.8228
0.2149 2450 - 0.2527 0.8285 0.8280
0.2193 2500 0.4523 0.2487 0.8291 0.8270
0.2237 2550 - 0.2524 0.8257 0.8244
0.2281 2600 - 0.2513 0.8228 0.8210
0.2325 2650 - 0.2531 0.8287 0.8265
0.2369 2700 - 0.2510 0.8224 0.8198
0.2413 2750 0.4522 0.2523 0.8275 0.8260
0.2457 2800 - 0.2563 0.8301 0.8278
0.2500 2850 - 0.2531 0.8242 0.8242
0.2544 2900 - 0.2527 0.8268 0.8268
0.2588 2950 - 0.2465 0.8228 0.8223
0.2632 3000 0.4472 0.2422 0.8263 0.8237
0.2676 3050 - 0.2484 0.8223 0.8195
0.2720 3100 - 0.2469 0.8209 0.8206
0.2764 3150 - 0.2419 0.8283 0.8281
0.2808 3200 - 0.2370 0.8303 0.8286
0.2851 3250 0.4499 0.2374 0.8293 0.8275
0.2895 3300 - 0.2340 0.8255 0.8255
0.2939 3350 - 0.2461 0.8277 0.8292
0.2983 3400 - 0.2421 0.8320 0.8307
0.3027 3450 - 0.2366 0.8286 0.8281
0.3071 3500 0.4305 0.2389 0.8312 0.8293
0.3115 3550 - 0.2360 0.8305 0.8310
0.3158 3600 - 0.2313 0.8271 0.8256
0.3202 3650 - 0.2182 0.8231 0.8197
0.3246 3700 - 0.2220 0.8274 0.8246
0.3290 3750 0.4221 0.2305 0.8301 0.8292
0.3334 3800 - 0.2244 0.8285 0.8265
0.3378 3850 - 0.2355 0.8349 0.8331
0.3422 3900 - 0.2256 0.8355 0.8330
0.3466 3950 - 0.2273 0.8330 0.8299
0.3509 4000 0.4203 0.2334 0.8304 0.8275
0.3553 4050 - 0.2223 0.8323 0.8305
0.3597 4100 - 0.2314 0.8323 0.8299
0.3641 4150 - 0.2196 0.8272 0.8244
0.3685 4200 - 0.2275 0.8342 0.8353
0.3729 4250 0.4039 0.2209 0.8348 0.8333
0.3773 4300 - 0.2152 0.8314 0.8307
0.3816 4350 - 0.2115 0.8353 0.8325
0.3860 4400 - 0.2195 0.8347 0.8310
0.3904 4450 - 0.2110 0.8293 0.8264
0.3948 4500 0.4065 0.2115 0.8321 0.8293
0.3992 4550 - 0.2139 0.8312 0.8286
0.4036 4600 - 0.2145 0.8319 0.8285
0.4080 4650 - 0.2127 0.8281 0.8255
0.4124 4700 - 0.2122 0.8292 0.8268
0.4167 4750 0.4019 0.2160 0.8354 0.8329
0.4211 4800 - 0.2069 0.8296 0.8258
0.4255 4850 - 0.2106 0.8362 0.8335
0.4299 4900 - 0.2130 0.8345 0.8321
0.4343 4950 - 0.2080 0.8307 0.8277
0.4387 5000 0.3941 0.2184 0.8394 0.8370
0.4431 5050 - 0.2061 0.8334 0.8325
0.4474 5100 - 0.2092 0.8318 0.8307
0.4518 5150 - 0.2108 0.8319 0.8289
0.4562 5200 - 0.2046 0.8359 0.8337
0.4606 5250 0.3873 0.1990 0.8327 0.8305
0.4650 5300 - 0.2007 0.8332 0.8305
0.4694 5350 - 0.1989 0.8284 0.8247
0.4738 5400 - 0.2117 0.8363 0.8346
0.4782 5450 - 0.2036 0.8329 0.8296
0.4825 5500 0.3808 0.1999 0.8341 0.8295
0.4869 5550 - 0.1998 0.8336 0.8300
0.4913 5600 - 0.2040 0.8348 0.8331
0.4957 5650 - 0.2068 0.8367 0.8346
0.5001 5700 - 0.1947 0.8333 0.8305
0.5045 5750 0.3779 0.1969 0.8352 0.8329
0.5089 5800 - 0.2028 0.8372 0.8369
0.5132 5850 - 0.2029 0.8336 0.8319
0.5176 5900 - 0.2029 0.8317 0.8309
0.5220 5950 - 0.2059 0.8270 0.8270
0.5264 6000 0.3704 0.1997 0.8263 0.8236
0.5308 6050 - 0.2001 0.8280 0.8252
0.5352 6100 - 0.1985 0.8275 0.8241
0.5396 6150 - 0.1976 0.8281 0.8281
0.5440 6200 - 0.1987 0.8270 0.8247
0.5483 6250 0.3722 0.2045 0.8320 0.8303
0.5527 6300 - 0.2013 0.8292 0.8278
0.5571 6350 - 0.2007 0.8302 0.8279
0.5615 6400 - 0.1949 0.8297 0.8274
0.5659 6450 - 0.2037 0.8335 0.8313
0.5703 6500 0.3638 0.2060 0.8316 0.8280
0.5747 6550 - 0.2030 0.8372 0.8348
0.5790 6600 - 0.1982 0.8317 0.8295
0.5834 6650 - 0.2075 0.8324 0.8325
0.5878 6700 - 0.2014 0.8306 0.8284
0.5922 6750 0.3581 0.1983 0.8360 0.8344
0.5966 6800 - 0.2007 0.8337 0.8313
0.6010 6850 - 0.2003 0.8349 0.8338
0.6054 6900 - 0.2018 0.8313 0.8305
0.6098 6950 - 0.1978 0.8323 0.8307
0.6141 7000 0.3596 0.1991 0.8370 0.8340
0.6185 7050 - 0.1963 0.8330 0.8302
0.6229 7100 - 0.1918 0.8334 0.8320
0.6273 7150 - 0.2008 0.8338 0.8327
0.6317 7200 - 0.1973 0.8320 0.8295
0.6361 7250 0.3614 0.1891 0.8339 0.8322
0.6405 7300 - 0.1961 0.8355 0.8332
0.6448 7350 - 0.1910 0.8322 0.8304
0.6492 7400 - 0.1926 0.8343 0.8331
0.6536 7450 - 0.1935 0.8310 0.8292
0.6580 7500 0.3513 0.1969 0.8337 0.8346
0.6624 7550 - 0.1891 0.8331 0.8311
0.6668 7600 - 0.1932 0.8369 0.8341
0.6712 7650 - 0.2041 0.8370 0.8357
0.6756 7700 - 0.1946 0.8335 0.8314
0.6799 7750 0.3426 0.1955 0.8364 0.8330
0.6843 7800 - 0.1940 0.8316 0.8307
0.6887 7850 - 0.1893 0.8323 0.8322
0.6931 7900 - 0.1839 0.8296 0.8286
0.6975 7950 - 0.1895 0.8321 0.8296
0.7019 8000 0.3406 0.1901 0.8277 0.8263
0.7063 8050 - 0.1835 0.8331 0.8284
0.7107 8100 - 0.1847 0.8359 0.8342
0.7150 8150 - 0.1892 0.8362 0.8348
0.7194 8200 - 0.1775 0.8339 0.8305
0.7238 8250 0.3357 0.1921 0.8359 0.8340
0.7282 8300 - 0.1881 0.8369 0.8344
0.7326 8350 - 0.1891 0.8371 0.8363
0.7370 8400 - 0.1880 0.8394 0.8364
0.7414 8450 - 0.1892 0.8348 0.8306
0.7457 8500 0.327 0.1868 0.8388 0.8353
0.7501 8550 - 0.1815 0.8378 0.8352
0.7545 8600 - 0.1877 0.8398 0.8370
0.7589 8650 - 0.1878 0.8392 0.8378
0.7633 8700 - 0.1778 0.8330 0.8304
0.7677 8750 0.3288 0.1791 0.8390 0.8360
0.7721 8800 - 0.1803 0.8298 0.8270
0.7765 8850 - 0.1803 0.8358 0.8323
0.7808 8900 - 0.1832 0.8330 0.8322
0.7852 8950 - 0.1767 0.8316 0.8286
0.7896 9000 0.329 0.1808 0.8283 0.8254
0.7940 9050 - 0.1842 0.8331 0.8293
0.7984 9100 - 0.1750 0.8304 0.8275
0.8028 9150 - 0.1779 0.8299 0.8270
0.8072 9200 - 0.1799 0.8332 0.8332
0.8115 9250 0.3283 0.1872 0.8399 0.8371
0.8159 9300 - 0.1842 0.8364 0.8352
0.8203 9350 - 0.1785 0.8415 0.8382
0.8247 9400 - 0.1822 0.8432 0.8407
0.8291 9450 - 0.1745 0.8380 0.8364
0.8335 9500 0.3271 0.1745 0.8374 0.8352
0.8379 9550 - 0.1746 0.8363 0.8332
0.8423 9600 - 0.1776 0.8391 0.8374
0.8466 9650 - 0.1760 0.8379 0.8353
0.8510 9700 - 0.1806 0.8360 0.8335
0.8554 9750 0.3309 0.1822 0.8368 0.8337
0.8598 9800 - 0.1765 0.8366 0.8336
0.8642 9850 - 0.1766 0.8353 0.8323
0.8686 9900 - 0.1698 0.8353 0.8315
0.8730 9950 - 0.1715 0.8378 0.8338
0.8773 10000 0.318 0.1782 0.8396 0.8357
0.8817 10050 - 0.1727 0.8382 0.8368
0.8861 10100 - 0.1740 0.8356 0.8330
0.8905 10150 - 0.1723 0.8347 0.8319
0.8949 10200 - 0.1656 0.8336 0.8314
0.8993 10250 0.3284 0.1742 0.8288 0.8264
0.9037 10300 - 0.1679 0.8315 0.8296
0.9081 10350 - 0.1694 0.8325 0.8296
0.9124 10400 - 0.1723 0.8319 0.8305
0.9168 10450 - 0.1638 0.8340 0.8310
0.9212 10500 0.313 0.1730 0.8371 0.8368
0.9256 10550 - 0.1639 0.8351 0.8327
0.9300 10600 - 0.1634 0.8379 0.8350
0.9344 10650 - 0.1745 0.8353 0.8340
0.9388 10700 - 0.1731 0.8349 0.8346
0.9431 10750 0.3145 0.1668 0.8333 0.8314
0.9475 10800 - 0.1653 0.8351 0.8338
0.9519 10850 - 0.1655 0.8401 0.8390
0.9563 10900 - 0.1708 0.8376 0.8360
0.9607 10950 - 0.1740 0.8382 0.8364
0.9651 11000 0.3002 0.1714 0.8401 0.8382
0.9695 11050 - 0.1647 0.8411 0.8393
0.9739 11100 - 0.1701 0.8418 0.8396
0.9782 11150 - 0.1665 0.8394 0.8379
0.9826 11200 - 0.1652 0.8377 0.8376
0.9870 11250 0.3094 0.1665 0.8408 0.8397
0.9914 11300 - 0.1689 0.8412 0.8393
0.9958 11350 - 0.1674 0.8400 0.8374
1.0002 11400 - 0.1694 0.8395 0.8376
1.0046 11450 - 0.1697 0.8434 0.8419
1.0089 11500 0.3004 0.1640 0.8399 0.8388
1.0133 11550 - 0.1731 0.8445 0.8426
1.0177 11600 - 0.1618 0.8430 0.8389
1.0221 11650 - 0.1646 0.8414 0.8377
1.0265 11700 - 0.1679 0.8435 0.8401
1.0309 11750 0.2984 0.1646 0.8413 0.8385
1.0353 11800 - 0.1797 0.8465 0.8432
1.0397 11850 - 0.1758 0.8393 0.8390
1.0440 11900 - 0.1690 0.8401 0.8379
1.0484 11950 - 0.1735 0.8423 0.8404
1.0528 12000 0.2896 0.1719 0.8384 0.8367
1.0572 12050 - 0.1759 0.8420 0.8403
1.0616 12100 - 0.1659 0.8360 0.8340
1.0660 12150 - 0.1645 0.8368 0.8362
1.0704 12200 - 0.1601 0.8380 0.8355
1.0747 12250 0.2954 0.1711 0.8406 0.8387
1.0791 12300 - 0.1691 0.8389 0.8370
1.0835 12350 - 0.1721 0.8397 0.8385
1.0879 12400 - 0.1689 0.8379 0.8351
1.0923 12450 - 0.1663 0.8424 0.8402
1.0967 12500 0.2864 0.1672 0.8418 0.8403
1.1011 12550 - 0.1689 0.8389 0.8386
1.1055 12600 - 0.1664 0.8410 0.8402
1.1098 12650 - 0.1685 0.8387 0.8376
1.1142 12700 - 0.1715 0.8419 0.8402
1.1186 12750 0.2745 0.1607 0.8373 0.8336
1.1230 12800 - 0.1620 0.8388 0.8379
1.1274 12850 - 0.1623 0.8417 0.8396
1.1318 12900 - 0.1589 0.8360 0.8342
1.1362 12950 - 0.1567 0.8300 0.8298
1.1406 13000 0.2768 0.1557 0.8406 0.8365
1.1449 13050 - 0.1581 0.8389 0.8363
1.1493 13100 - 0.1611 0.8399 0.8366
1.1537 13150 - 0.1583 0.8358 0.8348
1.1581 13200 - 0.1619 0.8405 0.8387
1.1625 13250 0.2737 0.1567 0.8373 0.8339
1.1669 13300 - 0.1642 0.8393 0.8374
1.1713 13350 - 0.1646 0.8404 0.8376
1.1756 13400 - 0.1601 0.8419 0.8402
1.1800 13450 - 0.1648 0.8412 0.8391
1.1844 13500 0.2627 0.1635 0.8403 0.8403
1.1888 13550 - 0.1662 0.8427 0.8407
1.1932 13600 - 0.1687 0.8381 0.8368
1.1976 13650 - 0.1693 0.8366 0.8365
1.2020 13700 - 0.1665 0.8410 0.8397
1.2064 13750 0.2738 0.1665 0.8373 0.8360
1.2107 13800 - 0.1667 0.8388 0.8389
1.2151 13850 - 0.1674 0.8455 0.8413
1.2195 13900 - 0.1704 0.8419 0.8382
1.2239 13950 - 0.1654 0.8417 0.8398
1.2283 14000 0.2563 0.1610 0.8414 0.8403
1.2327 14050 - 0.1625 0.8416 0.8380
1.2371 14100 - 0.1705 0.8411 0.8400
1.2414 14150 - 0.1628 0.8400 0.8384
1.2458 14200 - 0.1667 0.8448 0.8435
1.2502 14250 0.2693 0.1651 0.8406 0.8396
1.2546 14300 - 0.1673 0.8404 0.8388
1.2590 14350 - 0.1630 0.8392 0.8375
1.2634 14400 - 0.1633 0.8413 0.8403
1.2678 14450 - 0.1636 0.8412 0.8398
1.2722 14500 0.266 0.1613 0.8404 0.8379
1.2765 14550 - 0.1625 0.8392 0.8380
1.2809 14600 - 0.1634 0.8418 0.8397
1.2853 14650 - 0.1689 0.8426 0.8428
1.2897 14700 - 0.1617 0.8410 0.8405
1.2941 14750 0.2643 0.1661 0.8437 0.8417
1.2985 14800 - 0.1629 0.8409 0.8394
1.3029 14850 - 0.1584 0.8413 0.8387
1.3072 14900 - 0.1638 0.8446 0.8433
1.3116 14950 - 0.1644 0.8429 0.8426
1.3160 15000 0.2624 0.1570 0.8391 0.8386
1.3204 15050 - 0.1535 0.8367 0.8348
1.3248 15100 - 0.1591 0.8381 0.8367
1.3292 15150 - 0.1618 0.8421 0.8409
1.3336 15200 - 0.1554 0.8402 0.8381
1.3380 15250 0.2621 0.1595 0.8431 0.8427
1.3423 15300 - 0.1595 0.8447 0.8435
1.3467 15350 - 0.1585 0.8408 0.8394
1.3511 15400 - 0.1635 0.8403 0.8389
1.3555 15450 - 0.1569 0.8453 0.8444
1.3599 15500 0.2552 0.1605 0.8434 0.8412
1.3643 15550 - 0.1542 0.8420 0.8397
1.3687 15600 - 0.1622 0.8456 0.8451
1.3730 15650 - 0.1569 0.8466 0.8443
1.3774 15700 - 0.1550 0.8440 0.8416
1.3818 15750 0.2532 0.1569 0.8459 0.8445
1.3862 15800 - 0.1567 0.8462 0.8451
1.3906 15850 - 0.1504 0.8442 0.8422
1.3950 15900 - 0.1524 0.8437 0.8419
1.3994 15950 - 0.1491 0.8438 0.8413
1.4038 16000 0.265 0.1533 0.8428 0.8406
1.4081 16050 - 0.1492 0.8425 0.8399
1.4125 16100 - 0.1486 0.8410 0.8386
1.4169 16150 - 0.1530 0.8458 0.8433
1.4213 16200 - 0.1535 0.8437 0.8427
1.4257 16250 0.2512 0.1508 0.8453 0.8446
1.4301 16300 - 0.1540 0.8427 0.8411
1.4345 16350 - 0.1513 0.8414 0.8388
1.4388 16400 - 0.1553 0.8464 0.8461
1.4432 16450 - 0.1528 0.8434 0.8412
1.4476 16500 0.2545 0.1522 0.8419 0.8399
1.4520 16550 - 0.1521 0.8423 0.8416
1.4564 16600 - 0.1433 0.8427 0.8410
1.4608 16650 - 0.1500 0.8419 0.8401
1.4652 16700 - 0.1442 0.8425 0.8392
1.4696 16750 0.2549 0.1496 0.8397 0.8376
1.4739 16800 - 0.1556 0.8463 0.8435
1.4783 16850 - 0.1510 0.8458 0.8432
1.4827 16900 - 0.1469 0.8431 0.8423
1.4871 16950 - 0.1481 0.8456 0.8441
1.4915 17000 0.2522 0.1512 0.8456 0.8437
1.4959 17050 - 0.1471 0.8455 0.8430
1.5003 17100 - 0.1397 0.8409 0.8383
1.5046 17150 - 0.1414 0.8427 0.8404
1.5090 17200 - 0.1474 0.8432 0.8420
1.5134 17250 0.2489 0.1499 0.8414 0.8412
1.5178 17300 - 0.1442 0.8390 0.8376
1.5222 17350 - 0.1474 0.8373 0.8370
1.5266 17400 - 0.1435 0.8353 0.8352
1.5310 17450 - 0.1461 0.8380 0.8363
1.5354 17500 0.2493 0.1477 0.8362 0.8353
1.5397 17550 - 0.1503 0.8398 0.8385
1.5441 17600 - 0.1474 0.8372 0.8376
1.5485 17650 - 0.1499 0.8408 0.8390
1.5529 17700 - 0.1501 0.8386 0.8369
1.5573 17750 0.2499 0.1474 0.8367 0.8351
1.5617 17800 - 0.1406 0.8380 0.8362
1.5661 17850 - 0.1457 0.8399 0.8396
1.5705 17900 - 0.1486 0.8409 0.8399
1.5748 17950 - 0.1493 0.8407 0.8397
1.5792 18000 0.2419 0.1490 0.8400 0.8386
1.5836 18050 - 0.1496 0.8403 0.8388
1.5880 18100 - 0.1509 0.8422 0.8401
1.5924 18150 - 0.1513 0.8433 0.8420
1.5968 18200 - 0.1546 0.8420 0.8408
1.6012 18250 0.2458 0.1529 0.8414 0.8398
1.6055 18300 - 0.1580 0.8414 0.8391
1.6099 18350 - 0.1483 0.8389 0.8363
1.6143 18400 - 0.1501 0.8419 0.8405
1.6187 18450 - 0.1488 0.8413 0.8388
1.6231 18500 0.2532 0.1499 0.8418 0.8410
1.6275 18550 - 0.1520 0.8409 0.8408
1.6319 18600 - 0.1521 0.8407 0.8392
1.6363 18650 - 0.1459 0.8402 0.8382
1.6406 18700 - 0.1556 0.8433 0.8427
1.6450 18750 0.24 0.1501 0.8421 0.8410
1.6494 18800 - 0.1485 0.8439 0.8425
1.6538 18850 - 0.1526 0.8412 0.8406
1.6582 18900 - 0.1522 0.8422 0.8425
1.6626 18950 - 0.1456 0.8406 0.8390
1.6670 19000 0.2404 0.1483 0.8412 0.8408
1.6713 19050 - 0.1550 0.8424 0.8428
1.6757 19100 - 0.1493 0.8387 0.8384
1.6801 19150 - 0.1523 0.8391 0.8379
1.6845 19200 - 0.1512 0.8366 0.8343
1.6889 19250 0.2401 0.1506 0.8372 0.8348
1.6933 19300 - 0.1457 0.8375 0.8343
1.6977 19350 - 0.1500 0.8403 0.8379
1.7021 19400 - 0.1464 0.8380 0.8367
1.7064 19450 - 0.1485 0.8403 0.8397
1.7108 19500 0.2329 0.1469 0.8450 0.8417
1.7152 19550 - 0.1498 0.8418 0.8391
1.7196 19600 - 0.1427 0.8394 0.8384
1.7240 19650 - 0.1493 0.8399 0.8392
1.7284 19700 - 0.1487 0.8423 0.8406
1.7328 19750 0.2397 0.1464 0.8420 0.8398
1.7371 19800 - 0.1511 0.8433 0.8406
1.7415 19850 - 0.1502 0.8391 0.8365
1.7459 19900 - 0.1527 0.8404 0.8386
1.7503 19950 - 0.1498 0.8397 0.8390
1.7547 20000 0.2312 0.1505 0.8413 0.8389
1.7591 20050 - 0.1525 0.8411 0.8396
1.7635 20100 - 0.1491 0.8380 0.8370
1.7679 20150 - 0.1431 0.8395 0.8382
1.7722 20200 - 0.1451 0.8365 0.8352
1.7766 20250 0.2319 0.1485 0.8388 0.8366
1.7810 20300 - 0.1499 0.8376 0.8367
1.7854 20350 - 0.1448 0.8364 0.8349
1.7898 20400 - 0.1485 0.8346 0.8328
1.7942 20450 - 0.1470 0.8376 0.8364
1.7986 20500 0.2295 0.1471 0.8386 0.8363
1.8029 20550 - 0.1501 0.8351 0.8329
1.8073 20600 - 0.1494 0.8382 0.8364
1.8117 20650 - 0.1489 0.8405 0.8386
1.8161 20700 - 0.1465 0.8381 0.8372
1.8205 20750 0.2408 0.1435 0.8398 0.8390
1.8249 20800 - 0.1498 0.8449 0.8431
1.8293 20850 - 0.1487 0.8431 0.8416
1.8337 20900 - 0.1456 0.8419 0.8394
1.8380 20950 - 0.1437 0.8423 0.8408
1.8424 21000 0.2374 0.1408 0.8425 0.8414
1.8468 21050 - 0.1434 0.8434 0.8418
1.8512 21100 - 0.1486 0.8422 0.8403
1.8556 21150 - 0.1467 0.8429 0.8421
1.8600 21200 - 0.1458 0.8409 0.8402
1.8644 21250 0.2385 0.1449 0.8411 0.8395
1.8687 21300 - 0.1415 0.8401 0.8390
1.8731 21350 - 0.1462 0.8417 0.8403
1.8775 21400 - 0.1468 0.8423 0.8403
1.8819 21450 - 0.1459 0.8417 0.8394
1.8863 21500 0.2302 0.1466 0.8396 0.8372
1.8907 21550 - 0.1479 0.8391 0.8363
1.8951 21600 - 0.1407 0.8382 0.8365
1.8995 21650 - 0.1462 0.8377 0.8355
1.9038 21700 - 0.1438 0.8348 0.8343
1.9082 21750 0.2383 0.1451 0.8371 0.8363
1.9126 21800 - 0.1448 0.8375 0.8360
1.9170 21850 - 0.1389 0.8383 0.8377
1.9214 21900 - 0.1409 0.8379 0.8367
1.9258 21950 - 0.1397 0.8374 0.8352
1.9302 22000 0.2321 0.1408 0.8405 0.8385
1.9345 22050 - 0.1451 0.8381 0.8363
1.9389 22100 - 0.1467 0.8363 0.8353
1.9433 22150 - 0.1459 0.8352 0.8337
1.9477 22200 - 0.1431 0.8382 0.8355
1.9521 22250 0.2282 0.1457 0.8385 0.8371
1.9565 22300 - 0.1475 0.8364 0.8359
1.9609 22350 - 0.1483 0.8370 0.8336
1.9653 22400 - 0.1469 0.8406 0.8373
1.9696 22450 - 0.1430 0.8415 0.8391
1.9740 22500 0.2294 0.1471 0.8417 0.8399
1.9784 22550 - 0.1467 0.8414 0.8413
1.9828 22600 - 0.1464 0.8423 0.8410
1.9872 22650 - 0.1475 0.8431 0.8432
1.9916 22700 - 0.1476 0.8450 0.8442
1.9960 22750 0.2242 0.1463 0.8443 0.8418
2.0004 22800 - 0.1472 0.8422 0.8412
2.0047 22850 - 0.1506 0.8452 0.8435
2.0091 22900 - 0.1478 0.8463 0.8432
2.0135 22950 - 0.1536 0.8479 0.8454
2.0179 23000 0.2249 0.1487 0.8453 0.8422
2.0223 23050 - 0.1484 0.8430 0.8410
2.0267 23100 - 0.1524 0.8454 0.8440
2.0311 23150 - 0.1475 0.8450 0.8422
2.0354 23200 - 0.1533 0.8460 0.8435
2.0398 23250 0.2165 0.1551 0.8428 0.8410
2.0442 23300 - 0.1507 0.8425 0.8400
2.0486 23350 - 0.1517 0.8427 0.8410
2.0530 23400 - 0.1524 0.8404 0.8391
2.0574 23450 - 0.1515 0.8415 0.8408
2.0618 23500 0.2258 0.1500 0.8392 0.8384
2.0662 23550 - 0.1461 0.8387 0.8362
2.0705 23600 - 0.1429 0.8408 0.8378
2.0749 23650 - 0.1473 0.8410 0.8398
2.0793 23700 - 0.1474 0.8415 0.8402
2.0837 23750 0.2309 0.1479 0.8425 0.8408
2.0881 23800 - 0.1493 0.8427 0.8390
2.0925 23850 - 0.1469 0.8419 0.8394
2.0969 23900 - 0.1460 0.8426 0.8406
2.1012 23950 - 0.1502 0.8433 0.8418
2.1056 24000 0.2113 0.1462 0.8423 0.8406
2.1100 24050 - 0.1463 0.8429 0.8398
2.1144 24100 - 0.1459 0.8431 0.8400
2.1188 24150 - 0.1417 0.8403 0.8381
2.1232 24200 - 0.1396 0.8376 0.8371
2.1276 24250 0.2132 0.1419 0.8382 0.8380
2.1320 24300 - 0.1444 0.8378 0.8377
2.1363 24350 - 0.1399 0.8334 0.8342
2.1407 24400 - 0.1363 0.8382 0.8361
2.1451 24450 - 0.1379 0.8381 0.8369
2.1495 24500 0.2124 0.1421 0.8403 0.8391
2.1539 24550 - 0.1445 0.8399 0.8391
2.1583 24600 - 0.1452 0.8416 0.8401
2.1627 24650 - 0.1426 0.8411 0.8385
2.1670 24700 - 0.1447 0.8424 0.8407
2.1714 24750 0.2058 0.1460 0.8422 0.8413
2.1758 24800 - 0.1434 0.8422 0.8418
2.1802 24850 - 0.1443 0.8438 0.8416
2.1846 24900 - 0.1414 0.8422 0.8405
2.1890 24950 - 0.1437 0.8424 0.8407
2.1934 25000 0.2111 0.1466 0.8401 0.8394
2.1978 25050 - 0.1437 0.8390 0.8377
2.2021 25100 - 0.1446 0.8402 0.8394
2.2065 25150 - 0.1457 0.8394 0.8380
2.2109 25200 - 0.1432 0.8406 0.8380
2.2153 25250 0.2013 0.1464 0.8412 0.8397
2.2197 25300 - 0.1499 0.8419 0.8388
2.2241 25350 - 0.1466 0.8425 0.8402
2.2285 25400 - 0.1429 0.8424 0.8397
2.2328 25450 - 0.1433 0.8430 0.8404
2.2372 25500 0.2064 0.1472 0.8410 0.8404
2.2416 25550 - 0.1451 0.8406 0.8386
2.2460 25600 - 0.1480 0.8427 0.8419
2.2504 25650 - 0.1507 0.8409 0.8412
2.2548 25700 - 0.1488 0.8407 0.8398
2.2592 25750 0.2084 0.1476 0.8401 0.8392
2.2636 25800 - 0.1478 0.8403 0.8388
2.2679 25850 - 0.1509 0.8420 0.8417
2.2723 25900 - 0.1464 0.8417 0.8396
2.2767 25950 - 0.1469 0.8406 0.8388
2.2811 26000 0.2113 0.1470 0.8422 0.8404
2.2855 26050 - 0.1479 0.8414 0.8411
2.2899 26100 - 0.1488 0.8424 0.8418
2.2943 26150 - 0.1508 0.8429 0.8428
2.2986 26200 - 0.1507 0.8425 0.8422
2.3030 26250 0.2045 0.1496 0.8423 0.8416

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.2.0
  • Transformers: 4.45.2
  • PyTorch: 2.3.1
  • Accelerate: 1.0.1
  • Datasets: 3.0.1
  • Tokenizers: 0.20.1
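
To reproduce this environment, the listed versions can be pinned at install time:

pip install sentence-transformers==3.2.0 transformers==4.45.2 torch==2.3.1 accelerate==1.0.1 datasets==3.0.1 tokenizers==0.20.1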

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}