---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:21210762
  - loss:MSELoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
  - source_sentence: >-
      A Message Authentication Code is a protection against data being altered
      in transit by an attacker who has the ability to read the data in
      real-time.
    sentences:
      - 彼との仕事はめちゃくちゃクールだった。
      - もう何十台も通り過ぎた。
      - メッセージ認証コードは、データをリアルタイムで読み取る能力を持つ攻撃者による転送中のデータの改ざんから保護します。
  - source_sentence: >-
      “We have the best entertainers from New York, Hollywood and Las Vegas
      perform here.”
    sentences:
      - 「ニューヨークやハリウッド、ラスベガスからの素晴らしい芸人たちがここでショーをするんです」
      - 現在は様々なサプリが販売されています。
      - トルコの友人たちへの手紙
  - source_sentence: A correction was made on November 24th.
    sentences:
      - 私たちが抱え込んでいたトラウマ。
      - 11月24日に訂正いたしました。
      - なぜ憲法を学ぶのでしょうか。
  - source_sentence: We need more teachers like him nowadays.
    sentences:
      - そういうことが、いまの教員には、もっと必要でしょうね。
      - >-
        This may include, but is not limited to, investigating and intercepting
        payments into and out of your account(s) (particularly in the case of
        international transfers of funds) and investigating the source of or
        intended recipient of funds.
        ⇒ 少なくとも(違法行為が疑われる)名義人の口座の出入金(特に国際的な送金)について調査し、かつそれらを差し止めたうえで、入金についてはその送金元、出金についてはその受取り人を調査します。
      - 私がそこで授業をしているのである。
  - source_sentence: >-
      Nano-structure A nanostructure is an intermediate size between molecular
      and microscopic (micrometer-sized) structures.
    sentences:
      - 魔術のお話に戻りましょう。
      - 引き続きワシントンより。
      - ナノ構造は、分子構造と微視的(マイクロメートルサイズ)構造との間の中間サイズの対象である。
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: JSTS
          type: JSTS
        metrics:
          - type: pearson_cosine
            value: 0.823463331969533
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.7815308480362135
            name: Spearman Cosine
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: stsb multi mt en
          type: stsb_multi_mt-en
        metrics:
          - type: pearson_cosine
            value: 0.8362828278686943
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8564038929573722
            name: Spearman Cosine
---

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
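
As a quick programmatic check of the details above, the sequence limit, embedding dimensionality, and pooling mode can be read off a loaded model. A minimal sketch; the model ID is the same unfilled placeholder used in the Usage section below:

from sentence_transformers import SentenceTransformer

# "sentence_transformers_model_id" is the card's placeholder; substitute this repo's ID.
model = SentenceTransformer("sentence_transformers_model_id")

print(model.max_seq_length)                      # 128
print(model.get_sentence_embedding_dimension())  # 384
print(model[1].get_pooling_mode_str())           # mean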

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub ("sentence_transformers_model_id" is an unfilled
# placeholder; replace it with this repository's model ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Nano-structure A nanostructure is an intermediate size between molecular and microscopic (micrometer-sized) structures.',
    'ナノ構造は、分子構造と微視的(マイクロメートルサイズ)構造との間の中間サイズの対象である。',
    '魔術のお話に戻りましょう。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
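
Because English and Japanese text land in the same vector space, the model can also rank a Japanese corpus against an English query. A minimal sketch, reusing sentences from the widget examples above (the model ID is the same unfilled placeholder):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder, as above

query = "A correction was made on November 24th."
corpus = [
    "11月24日に訂正いたしました。",
    "なぜ憲法を学ぶのでしょうか。",
    "私たちが抱え込んでいたトラウマ。",
]

query_emb = model.encode(query)
corpus_emb = model.encode(corpus)

# Cosine similarities between the query and each corpus entry;
# the Japanese translation of the query should score highest.
scores = model.similarity(query_emb, corpus_emb)  # shape: torch.Size([1, 3])
best = int(scores.argmax())
print(corpus[best], float(scores[0, best]))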

Evaluation

Metrics

Semantic Similarity

Metric           JSTS     stsb_multi_mt-en
pearson_cosine   0.8235   0.8363
spearman_cosine  0.7815   0.8564
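
Scores like these are what sentence-transformers' EmbeddingSimilarityEvaluator reports. A hedged sketch of how such an evaluation runs; the pairs and gold scores below are illustrative stand-ins (not actual JSTS data), and gold scores are assumed to be normalized to [0, 1]:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder ID, as in Usage

# Illustrative stand-in pairs; the real runs used the JSTS and stsb_multi_mt (en) datasets.
sentences1 = [
    "A correction was made on November 24th.",
    "We need more teachers like him nowadays.",
    "A correction was made on November 24th.",
]
sentences2 = [
    "11月24日に訂正いたしました。",
    "そういうことが、いまの教員には、もっと必要でしょうね。",
    "なぜ憲法を学ぶのでしょうか。",
]
gold_scores = [1.0, 0.9, 0.0]  # assumed normalized similarity in [0, 1]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores, name="JSTS")
print(evaluator(model))  # includes pearson_cosine and spearman_cosine, as tabulated above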

Training Details

Training Dataset

Unnamed Dataset

  • Size: 21,210,762 training samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:
    • english (string): min 4 tokens, mean 16.46 tokens, max 92 tokens
    • non_english (string): min 4 tokens, mean 21.99 tokens, max 128 tokens
    • label (list): size 384 elements
  • Samples:
    • english: We live the life of the project.
      non_english: プロジェクトの生命を左右する。
      label: [0.009600206278264523, 0.058811139315366745, 0.023707984015345573, -0.021880649030208588, 0.068634033203125, ...]
    • english: Hold on here, Mr. Budget Director.
      non_english: ここいろ編集長
      label: [-0.04940887540578842, -0.013437069952487946, 0.024199623614549637, -0.02371774986386299, 0.06858911365270615, ...]
    • english: So yes, biology has all the attributes of a transportation genius today.
      non_english: そうです 生物は 今日話した最高の交通にある特性を 全て持ち合わせています
      label: [0.031787291169166565, 0.011292539536952972, 0.03621761128306389, -0.04237872734665871, -0.030112963169813156, ...]
  • Loss: MSELoss (see the sketch below)
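
The english/non_english/label layout above is the one sentence-transformers uses for multilingual knowledge distillation: label holds a teacher model's 384-dimensional embedding of the English sentence, and MSELoss pulls the student's embeddings of both text columns toward it. A minimal sketch under that reading; the teacher model is not named in this card, so the one below is a stand-in:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

student = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
teacher = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # stand-in; actual teacher unknown

english = ["We live the life of the project."]
non_english = ["プロジェクトの生命を左右する。"]
labels = teacher.encode(english)  # teacher embeddings become the regression targets

train_dataset = Dataset.from_dict({
    "english": english,
    "non_english": non_english,
    "label": [vec.tolist() for vec in labels],
})

# MSE between the student's embeddings of each text column and the label vector
loss = losses.MSELoss(model=student)
trainer = SentenceTransformerTrainer(model=student, train_dataset=train_dataset, loss=loss)
trainer.train()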

Evaluation Dataset

Unnamed Dataset

  • Size: 214,251 evaluation samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:
    • english (string): min 4 tokens, mean 16.24 tokens, max 88 tokens
    • non_english (string): min 4 tokens, mean 22.18 tokens, max 128 tokens
    • label (list): size 384 elements
  • Samples:
    • english: Then the next step was the social bookmarking.
      non_english: 次のカテゴリはソーシャルブックマークです。
      label: [-0.040418993681669235, 0.019537044689059258, -0.014964035712182522, -0.06385297328233719, 0.00023657231940887868, ...]
    • english: Ooh! Scary word! Ahh!
      non_english: なんと 恐ろしい言葉!
      label: [0.023886308073997498, -0.04336044192314148, -0.057255394756793976, 0.05142980441451073, 0.06282227486371994, ...]
    • english: Usually Ebates offers 1.
      non_english: 通常提示スプレッド*1
      label: [0.00018616259330883622, -0.01999301090836525, 0.049356017261743546, 0.002617522142827511, -0.0540102981030941, ...]
  • Loss: MSELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • gradient_accumulation_steps: 2
  • learning_rate: 0.0003
  • num_train_epochs: 8
  • warmup_ratio: 0.15
  • bf16: True
  • dataloader_num_workers: 8
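
Expressed in code, the list above maps directly onto SentenceTransformerTrainingArguments. A sketch; output_dir is a placeholder, since the card does not state it:

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder; not stated in this card
    eval_strategy="steps",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    gradient_accumulation_steps=2,
    learning_rate=3e-4,
    num_train_epochs=8,
    warmup_ratio=0.15,
    bf16=True,
    dataloader_num_workers=8,
)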

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0003
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 8
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.15
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss JSTS_spearman_cosine stsb_multi_mt-en_spearman_cosine
0.0241 500 0.0064 - - -
0.0483 1000 0.0045 - - -
0.0724 1500 0.0038 - - -
0.0966 2000 0.0035 0.0016 0.3008 0.2821
0.1207 2500 0.0033 - - -
0.1448 3000 0.0031 - - -
0.1690 3500 0.0029 - - -
0.1931 4000 0.0028 0.0013 0.4989 0.4681
0.2172 4500 0.0026 - - -
0.2414 5000 0.0025 - - -
0.2655 5500 0.0023 - - -
0.2897 6000 0.0022 0.0010 0.6554 0.6567
0.3138 6500 0.0021 - - -
0.3379 7000 0.002 - - -
0.3621 7500 0.0019 - - -
0.3862 8000 0.0018 0.0008 0.7038 0.7328
0.4104 8500 0.0017 - - -
0.4345 9000 0.0017 - - -
0.4586 9500 0.0016 - - -
0.4828 10000 0.0016 0.0007 0.7420 0.7662
0.5069 10500 0.0015 - - -
0.5310 11000 0.0015 - - -
0.5552 11500 0.0014 - - -
0.5793 12000 0.0014 0.0006 0.7559 0.7929
0.6035 12500 0.0014 - - -
0.6276 13000 0.0014 - - -
0.6517 13500 0.0013 - - -
0.6759 14000 0.0013 0.0006 0.7625 0.8056
0.7000 14500 0.0013 - - -
0.7241 15000 0.0013 - - -
0.7483 15500 0.0012 - - -
0.7724 16000 0.0012 0.0006 0.7652 0.8150
0.7966 16500 0.0012 - - -
0.8207 17000 0.0012 - - -
0.8448 17500 0.0012 - - -
0.8690 18000 0.0012 0.0005 0.7679 0.8209
0.8931 18500 0.0012 - - -
0.9173 19000 0.0011 - - -
0.9414 19500 0.0011 - - -
0.9655 20000 0.0011 0.0005 0.7727 0.8269
0.9897 20500 0.0011 - - -
1.0138 21000 0.0011 - - -
1.0379 21500 0.0011 - - -
1.0621 22000 0.0011 0.0005 0.7682 0.8319
1.0862 22500 0.0011 - - -
1.1104 23000 0.0011 - - -
1.1345 23500 0.0011 - - -
1.1586 24000 0.0011 0.0005 0.7718 0.8372
1.1828 24500 0.0011 - - -
1.2069 25000 0.0011 - - -
1.2311 25500 0.0011 - - -
1.2552 26000 0.001 0.0005 0.7751 0.8408
1.2793 26500 0.001 - - -
1.3035 27000 0.001 - - -
1.3276 27500 0.001 - - -
1.3517 28000 0.001 0.0005 0.7703 0.8437
1.3759 28500 0.001 - - -
1.4000 29000 0.001 - - -
1.4242 29500 0.001 - - -
1.4483 30000 0.001 0.0005 0.7730 0.8439
1.4724 30500 0.001 - - -
1.4966 31000 0.001 - - -
1.5207 31500 0.001 - - -
1.5448 32000 0.001 0.0005 0.7719 0.8456
1.5690 32500 0.001 - - -
1.5931 33000 0.001 - - -
1.6173 33500 0.001 - - -
1.6414 34000 0.001 0.0005 0.7719 0.8449
1.6655 34500 0.001 - - -
1.6897 35000 0.001 - - -
1.7138 35500 0.001 - - -
1.7380 36000 0.001 0.0004 0.7717 0.8455
1.7621 36500 0.001 - - -
1.7862 37000 0.001 - - -
1.8104 37500 0.001 - - -
1.8345 38000 0.001 0.0004 0.7714 0.8488
1.8586 38500 0.001 - - -
1.8828 39000 0.001 - - -
1.9069 39500 0.001 - - -
1.9311 40000 0.001 0.0004 0.7753 0.8474
1.9552 40500 0.001 - - -
1.9793 41000 0.001 - - -
2.0035 41500 0.001 - - -
2.0276 42000 0.001 0.0004 0.7708 0.8479
2.0518 42500 0.001 - - -
2.0759 43000 0.001 - - -
2.1000 43500 0.001 - - -
2.1242 44000 0.001 0.0004 0.7703 0.8505
2.1483 44500 0.001 - - -
2.1724 45000 0.0009 - - -
2.1966 45500 0.0009 - - -
2.2207 46000 0.0009 0.0004 0.7752 0.8525
2.2449 46500 0.0009 - - -
2.2690 47000 0.0009 - - -
2.2931 47500 0.0009 - - -
2.3173 48000 0.0009 0.0004 0.7734 0.8518
2.3414 48500 0.0009 - - -
2.3655 49000 0.0009 - - -
2.3897 49500 0.0009 - - -
2.4138 50000 0.0009 0.0004 0.7725 0.8512
2.4380 50500 0.0009 - - -
2.4621 51000 0.0009 - - -
2.4862 51500 0.0009 - - -
2.5104 52000 0.0009 0.0004 0.7709 0.8535
2.5345 52500 0.0009 - - -
2.5587 53000 0.0009 - - -
2.5828 53500 0.0009 - - -
2.6069 54000 0.0009 0.0004 0.7751 0.8519
2.6311 54500 0.0009 - - -
2.6552 55000 0.0009 - - -
2.6793 55500 0.0009 - - -
2.7035 56000 0.0009 0.0004 0.7770 0.8500
2.7276 56500 0.0009 - - -
2.7518 57000 0.0009 - - -
2.7759 57500 0.0009 - - -
2.8000 58000 0.0009 0.0004 0.7756 0.8514
2.8242 58500 0.0009 - - -
2.8483 59000 0.0009 - - -
2.8725 59500 0.0009 - - -
2.8966 60000 0.0009 0.0004 0.7791 0.8541
2.9207 60500 0.0009 - - -
2.9449 61000 0.0009 - - -
2.9690 61500 0.0009 - - -
2.9931 62000 0.0009 0.0004 0.7759 0.8539
3.0173 62500 0.0009 - - -
3.0414 63000 0.0009 - - -
3.0656 63500 0.0009 - - -
3.0897 64000 0.0009 0.0004 0.7770 0.8526
3.1138 64500 0.0009 - - -
3.1380 65000 0.0009 - - -
3.1621 65500 0.0009 - - -
3.1863 66000 0.0009 0.0004 0.7762 0.8531
3.2104 66500 0.0009 - - -
3.2345 67000 0.0009 - - -
3.2587 67500 0.0009 - - -
3.2828 68000 0.0009 0.0004 0.7771 0.8515
3.3069 68500 0.0009 - - -
3.3311 69000 0.0009 - - -
3.3552 69500 0.0009 - - -
3.3794 70000 0.0009 0.0004 0.7757 0.8530
3.4035 70500 0.0009 - - -
3.4276 71000 0.0009 - - -
3.4518 71500 0.0009 - - -
3.4759 72000 0.0009 0.0004 0.7776 0.8532
3.5000 72500 0.0009 - - -
3.5242 73000 0.0009 - - -
3.5483 73500 0.0009 - - -
3.5725 74000 0.0009 0.0004 0.7776 0.8542
3.5966 74500 0.0009 - - -
3.6207 75000 0.0009 - - -
3.6449 75500 0.0009 - - -
3.6690 76000 0.0009 0.0004 0.7803 0.8539
3.6932 76500 0.0009 - - -
3.7173 77000 0.0009 - - -
3.7414 77500 0.0009 - - -
3.7656 78000 0.0009 0.0004 0.7778 0.8537
3.7897 78500 0.0009 - - -
3.8138 79000 0.0009 - - -
3.8380 79500 0.0009 - - -
3.8621 80000 0.0009 0.0004 0.7800 0.8539
3.8863 80500 0.0009 - - -
3.9104 81000 0.0009 - - -
3.9345 81500 0.0009 - - -
3.9587 82000 0.0009 0.0004 0.7797 0.8542
3.9828 82500 0.0009 - - -
4.0070 83000 0.0009 - - -
4.0311 83500 0.0009 - - -
4.0552 84000 0.0009 0.0004 0.7808 0.8547
4.0794 84500 0.0009 - - -
4.1035 85000 0.0009 - - -
4.1276 85500 0.0009 - - -
4.1518 86000 0.0009 0.0004 0.7778 0.8545
4.1759 86500 0.0009 - - -
4.2001 87000 0.0009 - - -
4.2242 87500 0.0009 - - -
4.2483 88000 0.0009 0.0004 0.7815 0.8555
4.2725 88500 0.0009 - - -
4.2966 89000 0.0009 - - -
4.3207 89500 0.0009 - - -
4.3449 90000 0.0009 0.0004 0.7797 0.8534
4.3690 90500 0.0009 - - -
4.3932 91000 0.0009 - - -
4.4173 91500 0.0009 - - -
4.4414 92000 0.0009 0.0004 0.7823 0.8547
4.4656 92500 0.0009 - - -
4.4897 93000 0.0009 - - -
4.5139 93500 0.0009 - - -
4.5380 94000 0.0009 0.0004 0.7783 0.8535
4.5621 94500 0.0009 - - -
4.5863 95000 0.0009 - - -
4.6104 95500 0.0009 - - -
4.6345 96000 0.0009 0.0004 0.7811 0.8550
4.6587 96500 0.0009 - - -
4.6828 97000 0.0009 - - -
4.7070 97500 0.0009 - - -
4.7311 98000 0.0009 0.0004 0.7801 0.8540
4.7552 98500 0.0009 - - -
4.7794 99000 0.0009 - - -
4.8035 99500 0.0009 - - -
4.8277 100000 0.0009 0.0004 0.7811 0.8544
4.8518 100500 0.0009 - - -
4.8759 101000 0.0009 - - -
4.9001 101500 0.0009 - - -
4.9242 102000 0.0009 0.0004 0.7805 0.8548
4.9483 102500 0.0009 - - -
4.9725 103000 0.0009 - - -
4.9966 103500 0.0009 - - -
5.0208 104000 0.0009 0.0004 0.7797 0.8534
5.0449 104500 0.0009 - - -
5.0690 105000 0.0009 - - -
5.0932 105500 0.0009 - - -
5.1173 106000 0.0009 0.0004 0.7821 0.8555
5.1415 106500 0.0009 - - -
5.1656 107000 0.0009 - - -
5.1897 107500 0.0009 - - -
5.2139 108000 0.0009 0.0004 0.7816 0.8558
5.2380 108500 0.0009 - - -
5.2621 109000 0.0009 - - -
5.2863 109500 0.0009 - - -
5.3104 110000 0.0009 0.0004 0.7804 0.8556
5.3346 110500 0.0009 - - -
5.3587 111000 0.0009 - - -
5.3828 111500 0.0009 - - -
5.4070 112000 0.0009 0.0004 0.7813 0.8548
5.4311 112500 0.0009 - - -
5.4552 113000 0.0009 - - -
5.4794 113500 0.0009 - - -
5.5035 114000 0.0009 0.0004 0.7823 0.8548
5.5277 114500 0.0009 - - -
5.5518 115000 0.0009 - - -
5.5759 115500 0.0009 - - -
5.6001 116000 0.0009 0.0004 0.7809 0.8551
5.6242 116500 0.0009 - - -
5.6484 117000 0.0009 - - -
5.6725 117500 0.0009 - - -
5.6966 118000 0.0009 0.0004 0.7833 0.8557
5.7208 118500 0.0009 - - -
5.7449 119000 0.0009 - - -
5.7690 119500 0.0009 - - -
5.7932 120000 0.0009 0.0004 0.7842 0.8551
5.8173 120500 0.0009 - - -
5.8415 121000 0.0009 - - -
5.8656 121500 0.0009 - - -
5.8897 122000 0.0009 0.0004 0.7817 0.8563
5.9139 122500 0.0009 - - -
5.9380 123000 0.0009 - - -
5.9622 123500 0.0009 - - -
5.9863 124000 0.0009 0.0004 0.7812 0.8559
6.0104 124500 0.0009 - - -
6.0346 125000 0.0009 - - -
6.0587 125500 0.0009 - - -
6.0828 126000 0.0009 0.0004 0.7821 0.8558
6.1070 126500 0.0009 - - -
6.1311 127000 0.0009 - - -
6.1553 127500 0.0009 - - -
6.1794 128000 0.0009 0.0004 0.7829 0.8548
6.2035 128500 0.0009 - - -
6.2277 129000 0.0009 - - -
6.2518 129500 0.0009 - - -
6.2759 130000 0.0009 0.0004 0.7805 0.8549
6.3001 130500 0.0009 - - -
6.3242 131000 0.0009 - - -
6.3484 131500 0.0009 - - -
6.3725 132000 0.0009 0.0004 0.7807 0.8563
6.3966 132500 0.0009 - - -
6.4208 133000 0.0009 - - -
6.4449 133500 0.0009 - - -
6.4691 134000 0.0009 0.0004 0.7829 0.8555
6.4932 134500 0.0009 - - -
6.5173 135000 0.0009 - - -
6.5415 135500 0.0009 - - -
6.5656 136000 0.0009 0.0004 0.7819 0.8550
6.5897 136500 0.0009 - - -
6.6139 137000 0.0009 - - -
6.6380 137500 0.0009 - - -
6.6622 138000 0.0009 0.0004 0.7800 0.8548
6.6863 138500 0.0009 - - -
6.7104 139000 0.0009 - - -
6.7346 139500 0.0009 - - -
6.7587 140000 0.0009 0.0004 0.7817 0.8555
6.7829 140500 0.0009 - - -
6.8070 141000 0.0009 - - -
6.8311 141500 0.0009 - - -
6.8553 142000 0.0009 0.0004 0.7812 0.8556
6.8794 142500 0.0009 - - -
6.9035 143000 0.0009 - - -
6.9277 143500 0.0009 - - -
6.9518 144000 0.0009 0.0004 0.7830 0.8559
6.9760 144500 0.0009 - - -
7.0001 145000 0.0009 - - -
7.0242 145500 0.0009 - - -
7.0484 146000 0.0009 0.0004 0.7809 0.8561
7.0725 146500 0.0009 - - -
7.0966 147000 0.0009 - - -
7.1208 147500 0.0009 - - -
7.1449 148000 0.0009 0.0004 0.7798 0.8560
7.1691 148500 0.0009 - - -
7.1932 149000 0.0009 - - -
7.2173 149500 0.0009 - - -
7.2415 150000 0.0009 0.0004 0.7815 0.8559
7.2656 150500 0.0009 - - -
7.2898 151000 0.0009 - - -
7.3139 151500 0.0009 - - -
7.3380 152000 0.0009 0.0004 0.7828 0.8562
7.3622 152500 0.0009 - - -
7.3863 153000 0.0009 - - -
7.4104 153500 0.0009 - - -
7.4346 154000 0.0009 0.0004 0.7837 0.8565
7.4587 154500 0.0009 - - -
7.4829 155000 0.0009 - - -
7.5070 155500 0.0009 - - -
7.5311 156000 0.0009 0.0004 0.7819 0.8565
7.5553 156500 0.0009 - - -
7.5794 157000 0.0009 - - -
7.6036 157500 0.0009 - - -
7.6277 158000 0.0009 0.0004 0.7818 0.8557
7.6518 158500 0.0009 - - -
7.6760 159000 0.0009 - - -
7.7001 159500 0.0009 - - -
7.7242 160000 0.0009 0.0004 0.7811 0.8557
7.7484 160500 0.0009 - - -
7.7725 161000 0.0009 - - -
7.7967 161500 0.0009 - - -
7.8208 162000 0.0009 0.0004 0.7821 0.8566
7.8449 162500 0.0009 - - -
7.8691 163000 0.0009 - - -
7.8932 163500 0.0009 - - -
7.9174 164000 0.0009 0.0004 0.7815 0.8564
7.9415 164500 0.0009 - - -
7.9656 165000 0.0009 - - -
7.9898 165500 0.0009 - - -

Framework Versions

  • Python: 3.10.16
  • Sentence Transformers: 3.3.1
  • Transformers: 4.51.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MSELoss

@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}