---
tags:
  - ColBERT
  - PyLate
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:9461702
  - loss:Contrastive
base_model: answerdotai/ModernBERT-base
pipeline_tag: sentence-similarity
library_name: PyLate
metrics:
  - accuracy
model-index:
  - name: PyLate model based on answerdotai/ModernBERT-base
    results:
      - task:
          type: col-berttriplet
          name: Col BERTTriplet
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: accuracy
            value: 0.5281999707221985
            name: Accuracy
---

PyLate model based on answerdotai/ModernBERT-base

This is a PyLate model finetuned from answerdotai/ModernBERT-base. It maps sentences & paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.
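
Concretely, the MaxSim operator scores a query against a document by taking, for each query token embedding, the maximum similarity over all document token embeddings, and summing those maxima. A minimal PyTorch sketch of the scoring rule (illustrative only, not PyLate's internal implementation):

import torch

def maxsim_score(query_embeddings: torch.Tensor, document_embeddings: torch.Tensor) -> torch.Tensor:
    # For each query token, take the maximum cosine similarity over all
    # document tokens, then sum those maxima over the query tokens.
    # Both inputs are assumed L2-normalized, shape (num_tokens, 128).
    similarity = query_embeddings @ document_embeddings.T  # (q_tokens, d_tokens)
    return similarity.max(dim=1).values.sum()

query = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)
document = torch.nn.functional.normalize(torch.randn(180, 128), dim=-1)
print(maxsim_score(query, document))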

Model Details

Model Description

  • Model Type: PyLate model
  • Base model: answerdotai/ModernBERT-base
  • Document Length: 180 tokens
  • Query Length: 32 tokens
  • Output Dimensionality: 128 dimensions
  • Similarity Function: MaxSim
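
These settings show up directly in the embedding shapes. A small sanity check, assuming the model loads from the Hub under the name used elsewhere in this card:

from pylate import models

model = models.ColBERT(
    model_name_or_path="ayushexel/colbert-ModernBERT-base-5-neg-1-epoch-gooaq-1995000",
)

# Queries are padded/expanded to the fixed 32-token query length, so each
# query yields a (32, 128) matrix; documents yield one 128-d vector per
# token, up to the 180-token document length.
query_embeddings = model.encode(["a short query"], is_query=True)
document_embeddings = model.encode(["a short document"], is_query=False)
print(query_embeddings[0].shape)     # expected: (32, 128)
print(document_embeddings[0].shape)  # (num_document_tokens, 128)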

Model Sources

  • Documentation: PyLate Documentation (https://lightonai.github.io/pylate/)
  • Repository: PyLate on GitHub (https://github.com/lightonai/pylate)

Full Model Architecture

ColBERT(
  (0): Transformer({'max_seq_length': 179, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Dense({'in_features': 768, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
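
Module (1) is simply a bias-free linear projection that maps each 768-dimensional ModernBERT token state down to the 128-dimensional ColBERT space (the identity activation means no nonlinearity). A standalone torch illustration of that head:

import torch

# Equivalent of module (1): a bias-free linear map from 768-d hidden
# states to 128-d token embeddings, with no activation.
projection = torch.nn.Linear(in_features=768, out_features=128, bias=False)

token_states = torch.randn(180, 768)  # one hidden state per document token
token_embeddings = projection(token_states)
print(token_embeddings.shape)         # torch.Size([180, 128])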

Usage

First install the PyLate library:

pip install -U pylate

Retrieval

PyLate provides a streamlined interface to index and retrieve documents using ColBERT models. The index leverages the Voyager HNSW index to efficiently handle document embeddings and enable fast retrieval.

Indexing documents

First, load the ColBERT model and initialize the Voyager index, then encode and index your documents:

from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path="ayushexel/colbert-ModernBERT-base-5-neg-1-epoch-gooaq-1995000",
)

# Step 2: Initialize the Voyager index
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index if any
)

# Step 3: Encode the documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]

documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Ensure that it is set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)

Note that you do not have to recreate the index and encode the documents every time. Once you have created an index and added the documents, you can re-use the index later by loading it:

# To load an index, simply instantiate it with the correct folder/name and without overriding it
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
)

Retrieving top-k documents for queries

Once the documents are indexed, you can retrieve the top-k most relevant documents for a given set of queries. To do so, initialize the ColBERT retriever with the index you want to search in, encode the queries, and then retrieve the top-k documents to get the ids and relevance scores of the top matches:

# Step 1: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 2: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Ensure that it is set to True to indicate that these are queries
    show_progress_bar=True,
)

# Step 3: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,  # Retrieve the top 10 matches for each query
)
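
The retriever returns one result list per query, sorted by decreasing relevance. A quick inspection loop; the "id" and "score" field names are an assumption to verify against your installed PyLate version:

# Field names ("id", "score") are assumed; check your PyLate version.
for query_index, matches in enumerate(scores):
    print(f"Query {query_index}:")
    for match in matches:
        print(f"  id={match['id']}  score={match['score']:.4f}")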

Reranking

If you only want to use the ColBERT model to perform reranking on top of your first-stage retrieval pipeline without building an index, you can simply use the rank function and pass the queries and documents to rerank:

from pylate import rank, models

queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

model = models.ColBERT(
    model_name_or_path="ayushexel/colbert-ModernBERT-base-5-neg-1-epoch-gooaq-1995000",
)

queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
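
rank.rerank returns the candidates for each query reordered by decreasing MaxSim score. A quick inspection loop, with the same caveat that the "id" and "score" field names are assumptions:

# Field names ("id", "score") are assumed; check your PyLate version.
for query, reranked in zip(queries, reranked_documents):
    print(query)
    for document in reranked:
        print(f"  id={document['id']}  score={document['score']:.4f}")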

Evaluation

Metrics

ColBERTTriplet

  • Evaluated with pylate.evaluation.colbert_triplet.ColBERTTripletEvaluator
Metric     Value
accuracy   0.5282
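
Triplet accuracy is the fraction of (question, answer, negative) triplets for which the query scores the positive higher than the negative under MaxSim. A hedged sketch of running the evaluator, assuming it accepts anchor/positive/negative lists like the sentence-transformers TripletEvaluator it mirrors (verify the constructor against your PyLate version):

from pylate import evaluation, models

model = models.ColBERT(
    model_name_or_path="ayushexel/colbert-ModernBERT-base-5-neg-1-epoch-gooaq-1995000",
)

# Assumed constructor arguments, mirroring sentence-transformers'
# TripletEvaluator; see pylate.evaluation.colbert_triplet.ColBERTTripletEvaluator.
evaluator = evaluation.ColBERTTripletEvaluator(
    anchors=["what is the capital of france?"],
    positives=["Paris is the capital of France."],
    negatives=["Berlin is the capital of Germany."],
)
print(evaluator(model))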

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,461,702 training samples
  • Columns: question, answer, and negative
  • Approximate statistics based on the first 1000 samples:
    • question: string; min 9, mean 13.05, max 19 tokens
    • answer: string; min 25, mean 31.88, max 32 tokens
    • negative: string; min 16, mean 31.67, max 32 tokens
  • Samples (the three examples share the same question and answer; only the negative differs):
    • Question: what is the maximum income you can make while collecting social security?
      Answer: The Social Security earnings limit is $1,470 per month or $17,640 per year in 2019 for someone age 65 or younger. If you earn more than this amount, you can expect to have $1 withheld from your Social Security benefit for every $2 earned above the limit.
      Negative 1: Once you reach FRA, there is no cap on how much you can earn and still receive your full Social Security benefit. The earnings limits are adjusted annually for national wage trends. In 2020, you lose $1 in benefits for every $2 earned over $18,240.
      Negative 2: You can get Social Security retirement or survivors benefits and work at the same time. However, there is a limit to how much you can earn and still receive full benefits. If you are younger than full retirement age and earn more than the yearly earnings limit, we may reduce your benefit amount.
      Negative 3: If you haven't yet reached full retirement age, you can earn up to $17,640 in income each year without any reduction in benefits. But for each $2 you earn above this limit, the Social Security Administration deducts $1 from your benefit payments. Under full retirement age for part of a year.
  • Loss: pylate.losses.contrastive.Contrastive

Evaluation Dataset

Unnamed Dataset

  • Size: 5,000 evaluation samples
  • Columns: question, answer, and negative_1
  • Approximate statistics based on the first 1000 samples:
    • question: string; min 9, mean 12.93, max 22 tokens
    • answer: string; min 17, mean 31.7, max 32 tokens
    • negative_1: string; min 14, mean 31.4, max 32 tokens
  • Samples:
    • Question: are bird scooters in nyc?
      Answer: New York State is on the verge of embracing electric scooters and bicycles in a victory for tech leaders and delivery workers who have fought for months to make the speedy devices legal. ... There is just one catch — scooter rental companies like Bird and Lime cannot operate in Manhattan.
      Negative: New York State is on the verge of embracing electric scooters and bicycles in a victory for tech leaders and delivery workers who have fought for months to make the speedy devices legal. ... There is just one catch — scooter rental companies like Bird and Lime cannot operate in Manhattan.
    • Question: can you go into a bar if you're 18?
      Answer: You can enter a bar at 18 but you cannot consume alcoholic beverages until you are 21. ... Some states will make some exceptions for a parent allowing you to drink from their alcoholic beverage, but it is best to not do that in public places if you are under the age of 21 in the USA.
      Negative: 1. Re: How old do you have to be to enter a club, bar, pub? Generally 18 is fine, though some upscale bars may extend that to 21. Pubs don't have an age limit to enter, but you may get carded if ordering alcohol.
    • Question: how are blood pressure numbers written and recorded?
      Answer: Blood pressure is recorded as two numbers and written as a ratio: the top number, called the systolic pressure, is the pressure as the heart beats. The bottom number, called the diastolic pressure, is the measurement as the heart relaxes between beats.
      Negative: Blood pressure is recorded with 2 numbers. The systolic pressure (higher number) is the force at which your heart pumps blood around your body. The diastolic pressure (lower number) is the resistance to the blood flow in the blood vessels.
  • Loss: pylate.losses.contrastive.Contrastive

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • learning_rate: 3e-06
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 12
  • load_best_model_at_end: True
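
A minimal training sketch wiring in the non-default hyperparameters above. It assumes the pattern from PyLate's examples (sentence-transformers trainer, pylate.losses.Contrastive, and PyLate's ColBERT collator); the triplet file is hypothetical and the collator usage should be verified against your installed PyLate version:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

from pylate import losses, models, utils

model = models.ColBERT(model_name_or_path="answerdotai/ModernBERT-base")

# Hypothetical local file with "question", "answer", "negative" columns.
dataset = load_dataset("json", data_files="triplets.jsonl", split="train")
splits = dataset.train_test_split(test_size=0.01, seed=12)

args = SentenceTransformerTrainingArguments(
    output_dir="colbert-modernbert-gooaq",
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=3e-6,
    num_train_epochs=1,
    warmup_ratio=0.1,
    seed=12,
    bf16=True,
    dataloader_num_workers=12,
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    loss=losses.Contrastive(model=model),
    data_collator=utils.ColBERTCollator(model.tokenize),
)
trainer.train()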

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 12
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss Accuracy
0 0 - - 0.4558
0.0000 1 17.2657 - -
0.0027 200 17.0631 - -
0.0054 400 11.2015 - -
0.0081 600 7.8228 - -
0.0108 800 6.0774 - -
0.0135 1000 5.3122 - -
0.0162 1200 4.3348 - -
0.0189 1400 2.6982 - -
0.0216 1600 1.7959 - -
0.0244 1800 1.3555 - -
0.0271 2000 1.1443 - -
0.0298 2200 1.0092 - -
0.0325 2400 0.9274 - -
0.0352 2600 0.8485 - -
0.0379 2800 0.7953 - -
0.0406 3000 0.7541 - -
0.0433 3200 0.7302 - -
0.0460 3400 0.6836 - -
0.0487 3600 0.6546 - -
0.0514 3800 0.6219 - -
0.0541 4000 0.6116 - -
0.0568 4200 0.5813 - -
0.0595 4400 0.5499 - -
0.0622 4600 0.5334 - -
0.0649 4800 0.5276 - -
0.0676 5000 0.4969 - -
0.0703 5200 0.4789 - -
0.0731 5400 0.4709 - -
0.0758 5600 0.4598 - -
0.0785 5800 0.4465 - -
0.0812 6000 0.4333 - -
0.0839 6200 0.4258 - -
0.0866 6400 0.4056 - -
0.0893 6600 0.3855 - -
0.0920 6800 0.3855 - -
0.0947 7000 0.3761 - -
0.0974 7200 0.369 - -
0.1001 7400 0.3531 - -
0.1028 7600 0.3549 - -
0.1055 7800 0.3342 - -
0.1082 8000 0.3289 - -
0.1109 8200 0.3231 - -
0.1136 8400 0.3197 - -
0.1163 8600 0.3066 - -
0.1190 8800 0.309 - -
0.1218 9000 0.2953 - -
0.1245 9200 0.284 - -
0.1272 9400 0.2841 - -
0.1299 9600 0.2842 - -
0.1326 9800 0.2764 - -
0.1353 10000 0.2737 - -
0.1380 10200 0.2673 - -
0.1407 10400 0.2556 - -
0.1434 10600 0.2613 - -
0.1461 10800 0.2559 - -
0.1488 11000 0.2557 - -
0.1515 11200 0.2496 - -
0.1542 11400 0.2411 - -
0.1569 11600 0.2446 - -
0.1596 11800 0.2384 - -
0.1623 12000 0.2267 - -
0.1650 12200 0.2401 - -
0.1677 12400 0.2338 - -
0.1705 12600 0.2306 - -
0.1732 12800 0.2259 - -
0.1759 13000 0.2278 - -
0.1786 13200 0.2172 - -
0.1813 13400 0.2254 - -
0.1840 13600 0.2232 - -
0.1867 13800 0.2106 - -
0.1894 14000 0.2187 - -
0.1921 14200 0.2147 - -
0.1948 14400 0.2043 - -
0.1975 14600 0.2017 - -
0.2002 14800 0.2071 - -
0.2029 15000 0.2016 - -
0.2056 15200 0.1994 - -
0.2083 15400 0.2018 - -
0.2110 15600 0.1946 - -
0.2137 15800 0.1911 - -
0.2165 16000 0.1828 - -
0.2192 16200 0.1878 - -
0.2219 16400 0.1839 - -
0.2246 16600 0.1939 - -
0.2273 16800 0.1842 - -
0.2300 17000 0.1912 - -
0.2327 17200 0.1851 - -
0.2354 17400 0.1863 - -
0.2381 17600 0.1829 - -
0.2408 17800 0.1829 - -
0.2435 18000 0.177 - -
0.2462 18200 0.1768 - -
0.2489 18400 0.1819 - -
0.2516 18600 0.1778 - -
0.2543 18800 0.1803 - -
0.2570 19000 0.1758 - -
0.2597 19200 0.1736 - -
0.2624 19400 0.1759 - -
0.2652 19600 0.1751 - -
0.2679 19800 0.1739 - -
0.2706 20000 0.1677 - -
0 0 - - 0.5018
0.2706 20000 - 1.0521 -
0.2733 20200 0.1681 - -
0.2760 20400 0.1672 - -
0.2787 20600 0.1695 - -
0.2814 20800 0.1696 - -
0.2841 21000 0.1662 - -
0.2868 21200 0.1612 - -
0.2895 21400 0.1678 - -
0.2922 21600 0.1617 - -
0.2949 21800 0.1635 - -
0.2976 22000 0.1622 - -
0.3003 22200 0.1647 - -
0.3030 22400 0.1634 - -
0.3057 22600 0.1597 - -
0.3084 22800 0.1616 - -
0.3111 23000 0.1538 - -
0.3139 23200 0.1601 - -
0.3166 23400 0.1583 - -
0.3193 23600 0.161 - -
0.3220 23800 0.1539 - -
0.3247 24000 0.1602 - -
0.3274 24200 0.1493 - -
0.3301 24400 0.1536 - -
0.3328 24600 0.1572 - -
0.3355 24800 0.1577 - -
0.3382 25000 0.1508 - -
0.3409 25200 0.1514 - -
0.3436 25400 0.1506 - -
0.3463 25600 0.1544 - -
0.3490 25800 0.1574 - -
0.3517 26000 0.1507 - -
0.3544 26200 0.1462 - -
0.3571 26400 0.1527 - -
0.3598 26600 0.1474 - -
0.3626 26800 0.1516 - -
0.3653 27000 0.1447 - -
0.3680 27200 0.1484 - -
0.3707 27400 0.1454 - -
0.3734 27600 0.1467 - -
0.3761 27800 0.1517 - -
0.3788 28000 0.1505 - -
0.3815 28200 0.1395 - -
0.3842 28400 0.145 - -
0.3869 28600 0.143 - -
0.3896 28800 0.1417 - -
0.3923 29000 0.142 - -
0.3950 29200 0.1401 - -
0.3977 29400 0.1399 - -
0.4004 29600 0.1437 - -
0.4031 29800 0.1399 - -
0.4058 30000 0.1394 - -
0.4085 30200 0.1373 - -
0.4113 30400 0.1388 - -
0.4140 30600 0.1384 - -
0.4167 30800 0.1434 - -
0.4194 31000 0.1398 - -
0.4221 31200 0.1476 - -
0.4248 31400 0.1387 - -
0.4275 31600 0.1346 - -
0.4302 31800 0.137 - -
0.4329 32000 0.135 - -
0.4356 32200 0.1363 - -
0.4383 32400 0.1336 - -
0.4410 32600 0.1323 - -
0.4437 32800 0.1371 - -
0.4464 33000 0.1305 - -
0.4491 33200 0.1315 - -
0.4518 33400 0.1366 - -
0.4545 33600 0.1336 - -
0.4573 33800 0.1349 - -
0.4600 34000 0.1338 - -
0.4627 34200 0.1388 - -
0.4654 34400 0.1312 - -
0.4681 34600 0.1299 - -
0.4708 34800 0.1325 - -
0.4735 35000 0.1277 - -
0.4762 35200 0.132 - -
0.4789 35400 0.1322 - -
0.4816 35600 0.1286 - -
0.4843 35800 0.1322 - -
0.4870 36000 0.1342 - -
0.4897 36200 0.1306 - -
0.4924 36400 0.1339 - -
0.4951 36600 0.1327 - -
0.4978 36800 0.129 - -
0.5005 37000 0.1301 - -
0.5032 37200 0.1266 - -
0.5060 37400 0.1295 - -
0.5087 37600 0.1263 - -
0.5114 37800 0.1321 - -
0.5141 38000 0.1213 - -
0.5168 38200 0.1253 - -
0.5195 38400 0.13 - -
0.5222 38600 0.1234 - -
0.5249 38800 0.1259 - -
0.5276 39000 0.1303 - -
0.5303 39200 0.1268 - -
0.5330 39400 0.1229 - -
0.5357 39600 0.1291 - -
0.5384 39800 0.1257 - -
0.5411 40000 0.1249 - -
0 0 - - 0.5130
0.5411 40000 - 1.0519 -
0.5438 40200 0.1259 - -
0.5465 40400 0.1253 - -
0.5492 40600 0.1229 - -
0.5519 40800 0.1296 - -
0.5547 41000 0.1222 - -
0.5574 41200 0.1216 - -
0.5601 41400 0.1226 - -
0.5628 41600 0.1256 - -
0.5655 41800 0.1198 - -
0.5682 42000 0.1275 - -
0.5709 42200 0.1222 - -
0.5736 42400 0.1229 - -
0.5763 42600 0.123 - -
0.5790 42800 0.1162 - -
0.5817 43000 0.1234 - -
0.5844 43200 0.1253 - -
0.5871 43400 0.1221 - -
0.5898 43600 0.1223 - -
0.5925 43800 0.1244 - -
0.5952 44000 0.1254 - -
0.5979 44200 0.1227 - -
0.6006 44400 0.1168 - -
0.6034 44600 0.1184 - -
0.6061 44800 0.1191 - -
0.6088 45000 0.1174 - -
0.6115 45200 0.1103 - -
0.6142 45400 0.1181 - -
0.6169 45600 0.1192 - -
0.6196 45800 0.1206 - -
0.6223 46000 0.1196 - -
0.625 46200 0.1199 - -
0.6277 46400 0.1226 - -
0.6304 46600 0.1174 - -
0.6331 46800 0.118 - -
0.6358 47000 0.1185 - -
0.6385 47200 0.1193 - -
0.6412 47400 0.1181 - -
0.6439 47600 0.1228 - -
0.6466 47800 0.1235 - -
0.6494 48000 0.1191 - -
0.6521 48200 0.1142 - -
0.6548 48400 0.1166 - -
0.6575 48600 0.1218 - -
0.6602 48800 0.1189 - -
0.6629 49000 0.1196 - -
0.6656 49200 0.1153 - -
0.6683 49400 0.1132 - -
0.6710 49600 0.1191 - -
0.6737 49800 0.1148 - -
0.6764 50000 0.1087 - -
0.6791 50200 0.1145 - -
0.6818 50400 0.1175 - -
0.6845 50600 0.1145 - -
0.6872 50800 0.1175 - -
0.6899 51000 0.1131 - -
0.6926 51200 0.112 - -
0.6953 51400 0.1165 - -
0.6981 51600 0.124 - -
0.7008 51800 0.1129 - -
0.7035 52000 0.1111 - -
0.7062 52200 0.1143 - -
0.7089 52400 0.1118 - -
0.7116 52600 0.116 - -
0.7143 52800 0.1181 - -
0.7170 53000 0.1145 - -
0.7197 53200 0.1161 - -
0.7224 53400 0.1124 - -
0.7251 53600 0.1123 - -
0.7278 53800 0.1115 - -
0.7305 54000 0.1119 - -
0.7332 54200 0.114 - -
0.7359 54400 0.1145 - -
0.7386 54600 0.1095 - -
0.7413 54800 0.1199 - -
0.7440 55000 0.1129 - -
0.7468 55200 0.1147 - -
0.7495 55400 0.1091 - -
0.7522 55600 0.11 - -
0.7549 55800 0.1061 - -
0.7576 56000 0.1136 - -
0.7603 56200 0.112 - -
0.7630 56400 0.1116 - -
0.7657 56600 0.1132 - -
0.7684 56800 0.1067 - -
0.7711 57000 0.1116 - -
0.7738 57200 0.1119 - -
0.7765 57400 0.1097 - -
0.7792 57600 0.1095 - -
0.7819 57800 0.1101 - -
0.7846 58000 0.1121 - -
0.7873 58200 0.1118 - -
0.7900 58400 0.1152 - -
0.7927 58600 0.1106 - -
0.7955 58800 0.1106 - -
0.7982 59000 0.1117 - -
0.8009 59200 0.1089 - -
0.8036 59400 0.1087 - -
0.8063 59600 0.111 - -
0.8090 59800 0.1095 - -
0.8117 60000 0.1144 - -
0 0 - - 0.5282
0.8117 60000 - 1.0542 -
0.8144 60200 0.1134 - -
0.8171 60400 0.1107 - -
0.8198 60600 0.1102 - -
0.8225 60800 0.1088 - -
0.8252 61000 0.1123 - -
0.8279 61200 0.1081 - -
0.8306 61400 0.1097 - -
0.8333 61600 0.1077 - -
0.8360 61800 0.1069 - -
0.8387 62000 0.109 - -
0.8415 62200 0.1086 - -
0.8442 62400 0.1144 - -
0.8469 62600 0.107 - -
0.8496 62800 0.1064 - -
0.8523 63000 0.1077 - -
0.8550 63200 0.1044 - -
0.8577 63400 0.103 - -
0.8604 63600 0.1106 - -
0.8631 63800 0.1137 - -
0.8658 64000 0.1109 - -
0.8685 64200 0.112 - -
0.8712 64400 0.1111 - -
0.8739 64600 0.1073 - -
0.8766 64800 0.1067 - -
0.8793 65000 0.1084 - -
0.8820 65200 0.1081 - -
0.8847 65400 0.1096 - -
0.8874 65600 0.1084 - -
0.8902 65800 0.1014 - -
0.8929 66000 0.1071 - -
0.8956 66200 0.1043 - -
0.8983 66400 0.1112 - -
0.9010 66600 0.1089 - -
0.9037 66800 0.1086 - -
0.9064 67000 0.1025 - -
0.9091 67200 0.1024 - -
0.9118 67400 0.1101 - -
0.9145 67600 0.1075 - -
0.9172 67800 0.1059 - -
0.9199 68000 0.1085 - -
0.9226 68200 0.1036 - -
0.9253 68400 0.1056 - -
0.9280 68600 0.1071 - -
0.9307 68800 0.1065 - -
0.9334 69000 0.1117 - -
0.9361 69200 0.1074 - -
0.9389 69400 0.1021 - -
0.9416 69600 0.1081 - -
0.9443 69800 0.1071 - -
0.9470 70000 0.1056 - -
0.9497 70200 0.1108 - -
0.9524 70400 0.1093 - -
0.9551 70600 0.1065 - -
0.9578 70800 0.1092 - -
0.9605 71000 0.1081 - -
0.9632 71200 0.1031 - -
0.9659 71400 0.1075 - -
0.9686 71600 0.1101 - -
0.9713 71800 0.1063 - -
0.9740 72000 0.1076 - -
0.9767 72200 0.1039 - -
0.9794 72400 0.1102 - -
0.9821 72600 0.1085 - -
0.9848 72800 0.1068 - -
0.9876 73000 0.1062 - -
0.9903 73200 0.1049 - -
0.9930 73400 0.1132 - -
0.9957 73600 0.1095 - -
0.9984 73800 0.1072 - -
  • The row at step 60000 (validation loss 1.0542, accuracy 0.5282) denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.0
  • Sentence Transformers: 4.0.1
  • PyLate: 1.1.7
  • Transformers: 4.48.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1
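
To approximate this environment, the versions above can be pinned at install time (the exact torch CUDA build, cu124 here, varies by platform):

pip install "pylate==1.1.7" "sentence-transformers==4.0.1" "transformers==4.48.2" "torch==2.6.0" "accelerate==1.6.0" "datasets==3.5.0" "tokenizers==0.21.1"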

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084"
}

PyLate

@misc{PyLate,
    title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
    author={Chaffin, Antoine and Sourty, Raphaël},
    url={https://github.com/lightonai/pylate},
    year={2024}
}