SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the trivia dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 196 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: trivia
  • Language: en

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 196, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
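
For intuition, the three modules map onto plain transformers calls roughly as follows. This is a minimal illustrative sketch (assuming the checkpoint loads via AutoModel); in practice SentenceTransformer handles batching, truncation, and device placement for you:

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bwang0911/bge-int8")
bert = AutoModel.from_pretrained("bwang0911/bge-int8")

# (0) Transformer: tokenize (truncated to 196 tokens) and encode with BERT
inputs = tokenizer("Where in Australia was Errol Flynn born?",
                   max_length=196, truncation=True, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**inputs).last_hidden_state

# (1) Pooling: keep only the [CLS] token (pooling_mode_cls_token=True)
cls_embedding = token_embeddings[:, 0]

# (2) Normalize: L2-normalize so dot products equal cosine similarities
embedding = F.normalize(cls_embedding, p=2, dim=1)
print(embedding.shape)  # torch.Size([1, 768])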

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bwang0911/bge-int8")
# Run inference
sentences = [
    'Where in Australia was swashbuckling Errol Flynn born?',
    'Errol Flynn Blood" (1935), Major Geoffrey Vickers in "The Charge of the Light Brigade" (1936), as well as a number of Westerns, such as "Dodge City" (1939), "Santa Fe Trail" (1940), and "San Antonio" (1945). Errol Leslie Flynn was born on 20 June 1909 in Battery Point, a suburb of Hobart, Tasmania, Australia. His father, Theodore Thomson Flynn, was a lecturer (1909) and later professor (1911) of biology at the University of Tasmania. His mother was born Lily Mary Young, but shortly after marrying Theodore at St John\'s Church of England, Birchgrove, Sydney, on 23 January 1909, she changed her first name',
    'Errol Flynn early in his career: Errol Flynn Errol Leslie Thomson Flynn (20 June 1909 – 14 October 1959) was an Australian-born American actor during the Golden Age of Hollywood. Considered the natural successor to Douglas Fairbanks, he achieved worldwide fame for his romantic swashbuckler roles in Hollywood films, as well as frequent partnerships with Olivia de Havilland. He was best known for his role as Robin Hood in "The Adventures of Robin Hood" (1938); his portrayal of the character was named by the American Film Institute as the 18th greatest hero in American film history. His other famous roles included the',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
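
Since the model was finetuned on question-to-passage triplets, a typical downstream use is semantic search over a passage corpus. A minimal sketch using the library's util.semantic_search helper (the corpus and query here are illustrative):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bwang0911/bge-int8")

corpus = [
    "Errol Flynn was born on 20 June 1909 in Hobart, Tasmania, Australia.",
    "Judi Dench was born on 9 December 1934 in York, England.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("Where was Errol Flynn born?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)
print(hits[0])  # e.g. [{'corpus_id': 0, 'score': 0.87}]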

Evaluation

Metrics

Information Retrieval

  • Datasets: NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
  • Evaluated with InformationRetrievalEvaluator (a usage sketch follows the table below)
| Metric | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.28 | 0.6 | 0.64 | 0.3 | 0.66 | 0.38 | 0.34 | 0.26 | 0.86 | 0.34 | 0.18 | 0.56 | 0.4694 |
| cosine_accuracy@3 | 0.46 | 0.78 | 0.9 | 0.38 | 0.78 | 0.5 | 0.5 | 0.48 | 0.96 | 0.48 | 0.52 | 0.66 | 0.7347 |
| cosine_accuracy@5 | 0.54 | 0.84 | 0.92 | 0.42 | 0.82 | 0.6 | 0.56 | 0.64 | 0.96 | 0.6 | 0.64 | 0.68 | 0.8163 |
| cosine_accuracy@10 | 0.68 | 0.92 | 0.94 | 0.48 | 0.92 | 0.74 | 0.6 | 0.72 | 1.0 | 0.7 | 0.8 | 0.84 | 0.9184 |
| cosine_precision@1 | 0.28 | 0.6 | 0.64 | 0.3 | 0.66 | 0.38 | 0.34 | 0.26 | 0.86 | 0.34 | 0.18 | 0.56 | 0.4694 |
| cosine_precision@3 | 0.18 | 0.52 | 0.3 | 0.18 | 0.3467 | 0.1667 | 0.3 | 0.16 | 0.4 | 0.2467 | 0.1733 | 0.2333 | 0.3946 |
| cosine_precision@5 | 0.132 | 0.496 | 0.188 | 0.128 | 0.244 | 0.12 | 0.3 | 0.132 | 0.252 | 0.204 | 0.128 | 0.144 | 0.3878 |
| cosine_precision@10 | 0.088 | 0.432 | 0.096 | 0.076 | 0.142 | 0.074 | 0.244 | 0.076 | 0.136 | 0.132 | 0.08 | 0.096 | 0.3265 |
| cosine_recall@1 | 0.1217 | 0.0629 | 0.6167 | 0.1554 | 0.33 | 0.38 | 0.0121 | 0.25 | 0.7473 | 0.0737 | 0.18 | 0.54 | 0.0357 |
| cosine_recall@3 | 0.2267 | 0.1301 | 0.8567 | 0.2226 | 0.52 | 0.5 | 0.0406 | 0.46 | 0.9253 | 0.1547 | 0.52 | 0.64 | 0.0855 |
| cosine_recall@5 | 0.2773 | 0.1827 | 0.8767 | 0.2646 | 0.61 | 0.6 | 0.0587 | 0.61 | 0.942 | 0.2117 | 0.64 | 0.66 | 0.1405 |
| cosine_recall@10 | 0.349 | 0.2871 | 0.8967 | 0.3058 | 0.71 | 0.74 | 0.0821 | 0.69 | 0.9933 | 0.2727 | 0.8 | 0.84 | 0.2204 |
| cosine_ndcg@10 | 0.2921 | 0.5202 | 0.7847 | 0.2791 | 0.6314 | 0.5328 | 0.2699 | 0.465 | 0.9204 | 0.272 | 0.4849 | 0.6789 | 0.3691 |
| cosine_mrr@10 | 0.3988 | 0.7049 | 0.7672 | 0.3567 | 0.7392 | 0.4702 | 0.428 | 0.3999 | 0.9073 | 0.4481 | 0.3844 | 0.6334 | 0.611 |
| cosine_map@100 | 0.2301 | 0.3763 | 0.7406 | 0.247 | 0.5493 | 0.4806 | 0.0994 | 0.3974 | 0.8925 | 0.223 | 0.393 | 0.6282 | 0.2939 |
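
As a rough guide to how these numbers are produced, InformationRetrievalEvaluator takes query and corpus dictionaries plus relevance judgments and reports the metrics above. A minimal sketch with toy data standing in for the NanoBEIR files:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("bwang0911/bge-int8")

# Toy stand-ins; the reported numbers come from the NanoBEIR datasets.
queries = {"q1": "Where in Australia was Errol Flynn born?"}
corpus = {
    "d1": "Errol Flynn was born in Hobart, Tasmania, Australia.",
    "d2": "Judi Dench was born in York, England.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="toy",
)
results = evaluator(model)
print(results["toy_cosine_ndcg@10"])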

Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with NanoBEIREvaluator (a reproduction sketch follows the results table below) with these parameters:
    {
        "dataset_names": [
            "climatefever",
            "dbpedia",
            "fever",
            "fiqa2018",
            "hotpotqa",
            "msmarco",
            "nfcorpus",
            "nq",
            "quoraretrieval",
            "scidocs",
            "arguana",
            "scifact",
            "touche2020"
        ]
    }
    
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4515 |
| cosine_accuracy@3 | 0.6257 |
| cosine_accuracy@5 | 0.6951 |
| cosine_accuracy@10 | 0.7891 |
| cosine_precision@1 | 0.4515 |
| cosine_precision@3 | 0.277 |
| cosine_precision@5 | 0.2197 |
| cosine_precision@10 | 0.1537 |
| cosine_recall@1 | 0.2697 |
| cosine_recall@3 | 0.4063 |
| cosine_recall@5 | 0.4672 |
| cosine_recall@10 | 0.5528 |
| cosine_ndcg@10 | 0.5 |
| cosine_mrr@10 | 0.5576 |
| cosine_map@100 | 0.427 |
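
The means above can be reproduced with NanoBEIREvaluator, which downloads the small NanoBEIR subsets and averages the per-dataset metrics. A minimal sketch using the dataset names from the parameters above:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import NanoBEIREvaluator

model = SentenceTransformer("bwang0911/bge-int8")
evaluator = NanoBEIREvaluator(
    dataset_names=[
        "climatefever", "dbpedia", "fever", "fiqa2018", "hotpotqa",
        "msmarco", "nfcorpus", "nq", "quoraretrieval", "scidocs",
        "arguana", "scifact", "touche2020",
    ]
)
results = evaluator(model)
print(results["NanoBEIR_mean_cosine_ndcg@10"])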

Training Details

Training Dataset

trivia

  • Dataset: trivia at bfe9460
  • Size: 60,315 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |  | anchor | positive | negative |
    |---|---|---|---|
    | type | string | string | string |
    | details | min: 8 tokens, mean: 15.15 tokens, max: 42 tokens | min: 113 tokens, mean: 138.92 tokens, max: 196 tokens | min: 111 tokens, mean: 137.88 tokens, max: 196 tokens |
  • Samples:
    | anchor | positive | negative |
    |---|---|---|
    | Which American-born Sinclair won the Nobel Prize for Literature in 1930? | Sinclair Lewis Sinclair Lewis Harry Sinclair Lewis (February 7, 1885 – January 10, 1951) was an American novelist, short-story writer, and playwright. In 1930, he became the first writer from the United States to receive the Nobel Prize in Literature, which was awarded "for his vigorous and graphic art of description and his ability to create, with wit and humor, new types of characters." His works are known for their insightful and critical views of American capitalism and materialism between the wars. He is also respected for his strong characterizations of modern working women. H. L. Mencken wrote of him, "[If] there | Nobel Prize in Literature analyze its importance on potential future Nobel Prize in Literature laureates. Only Alice Munro (2009) has been awarded with both. The Neustadt International Prize for Literature is regarded as one of the most prestigious international literary prizes, often referred to as the American equivalent to the Nobel Prize. Like the Nobel or the Man Booker International Prize, it is awarded not for any one work, but for an entire body of work. It is frequently seen as an indicator of who may be awarded the Nobel Prize in Literature. Gabriel García Márquez (1972 Neustadt, 1982 Nobel), Czesław Miłosz (1978 Neustadt, |
    | Where in England was Dame Judi Dench born? | Judi Dench regular contact with the theatre. Her father, a physician, was also the GP for the York theatre, and her mother was its wardrobe mistress. Actors often stayed in the Dench household. During these years, Judi Dench was involved on a non-professional basis in the first three productions of the modern revival of the York Mystery Plays in 1951, 1954 and 1957. In the third production she played the role of the Virgin Mary, performed on a fixed stage in the Museum Gardens. Though she initially trained as a set designer, she became interested in drama school as her brother Jeff | Judi Dench to independence, published in August 2014, a few weeks before the Scottish referendum. In September 2018, Dench criticized the response to the sexual misconduct allegations made against actor Kevin Spacey, referring to him as a "good friend". Judi Dench Dame Judith Olivia Dench (born 9 December 1934) is an English actress. Dench made her professional debut in 1957 with the Old Vic Company. Over the following few years, she performed in several of Shakespeare's plays, in such roles as Ophelia in "Hamlet", Juliet in "Romeo and Juliet", and Lady Macbeth in "Macbeth". Although most of her work during this period |
    | From which country did Angola achieve independence in 1975? | Corruption in Angola they really are. Angola's colonial era ended with the Angolan War of Independence against Portugal occurred between 1970 and 1975. Independence did not produce a unified Angola, however; the country plunged into years of civil war between the National Union for the Total Independence of Angola (UNITA) and the governing Popular Movement for the Liberation of Angola (MPLA). 30 years of war would produce historical legacies that combine to allow for the persistence of a highly corrupt government system. The Angolan civil war was fought between the pro-western UNITA and the communist MPLA and had the characteristics typical of a | Cuban intervention in Angola Cuban intervention in Angola In November 1975, on the eve of Angola's independence, Cuba launched a large-scale military intervention in support of the leftist People's Movement for the Liberation of Angola (MPLA) against United States-backed interventions by South Africa and Zaire in support of two right-wing independence movements competing for power in the country, the National Liberation Front of Angola (FNLA) and the National Union for the Total Independence of Angola (UNITA). By the end of 1975 the Cuban military in Angola numbered more than 25,000 troops. Following the withdrawal of Zaire and South Africa, Cuban forces remained in Angola |
  • Loss: MultipleNegativesRankingLoss (a construction sketch follows this list) with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
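
A minimal sketch of constructing this loss with the listed parameters. MultipleNegativesRankingLoss scores each anchor against its positive and against every other passage in the batch (including the explicit negatives), which is why the large training batch size of 128 is beneficial:

from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# scale=20.0 sharpens the softmax over in-batch similarities;
# util.cos_sim matches the "cos_sim" similarity_fct listed above.
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)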
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • learning_rate: 1e-05
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
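
These non-default values map onto the standard SentenceTransformerTrainer setup roughly as follows. A minimal sketch; the dataset path is a hypothetical placeholder for the trivia triplet dataset described above:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
train_dataset = load_dataset("path/to/trivia", split="train")  # hypothetical path
loss = MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-trivia",
    num_train_epochs=3,  # from All Hyperparameters below
    per_device_train_batch_size=128,
    learning_rate=1e-5,
    warmup_ratio=0.1,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoids duplicate in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()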

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Columns: Epoch, Step, Training Loss, then cosine_ndcg@10 for NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact, NanoTouche2020, and NanoBEIR_mean. A "-" means the evaluator did not run at that step; it ran once, at step 1000.
0.0212 10 2.7514 - - - - - - - - - - - - - -
0.0424 20 2.7415 - - - - - - - - - - - - - -
0.0636 30 2.5319 - - - - - - - - - - - - - -
0.0847 40 2.3283 - - - - - - - - - - - - - -
0.1059 50 2.0535 - - - - - - - - - - - - - -
0.1271 60 1.8257 - - - - - - - - - - - - - -
0.1483 70 1.6569 - - - - - - - - - - - - - -
0.1695 80 1.5127 - - - - - - - - - - - - - -
0.1907 90 1.3586 - - - - - - - - - - - - - -
0.2119 100 1.3002 - - - - - - - - - - - - - -
0.2331 110 1.2825 - - - - - - - - - - - - - -
0.2542 120 1.1649 - - - - - - - - - - - - - -
0.2754 130 1.1589 - - - - - - - - - - - - - -
0.2966 140 1.1404 - - - - - - - - - - - - - -
0.3178 150 1.1462 - - - - - - - - - - - - - -
0.3390 160 1.1297 - - - - - - - - - - - - - -
0.3602 170 1.0774 - - - - - - - - - - - - - -
0.3814 180 1.0845 - - - - - - - - - - - - - -
0.4025 190 1.0574 - - - - - - - - - - - - - -
0.4237 200 1.1048 - - - - - - - - - - - - - -
0.4449 210 1.0817 - - - - - - - - - - - - - -
0.4661 220 1.0603 - - - - - - - - - - - - - -
0.4873 230 1.0383 - - - - - - - - - - - - - -
0.5085 240 1.0197 - - - - - - - - - - - - - -
0.5297 250 1.0979 - - - - - - - - - - - - - -
0.5508 260 1.0303 - - - - - - - - - - - - - -
0.5720 270 1.0363 - - - - - - - - - - - - - -
0.5932 280 1.0433 - - - - - - - - - - - - - -
0.6144 290 0.98 - - - - - - - - - - - - - -
0.6356 300 1.0272 - - - - - - - - - - - - - -
0.6568 310 1.054 - - - - - - - - - - - - - -
0.6780 320 1.0213 - - - - - - - - - - - - - -
0.6992 330 1.0111 - - - - - - - - - - - - - -
0.7203 340 0.9849 - - - - - - - - - - - - - -
0.7415 350 1.0054 - - - - - - - - - - - - - -
0.7627 360 0.9998 - - - - - - - - - - - - - -
0.7839 370 0.9871 - - - - - - - - - - - - - -
0.8051 380 1.0223 - - - - - - - - - - - - - -
0.8263 390 0.9592 - - - - - - - - - - - - - -
0.8475 400 0.9736 - - - - - - - - - - - - - -
0.8686 410 0.9653 - - - - - - - - - - - - - -
0.8898 420 0.9856 - - - - - - - - - - - - - -
0.9110 430 1.0445 - - - - - - - - - - - - - -
0.9322 440 0.9818 - - - - - - - - - - - - - -
0.9534 450 0.9937 - - - - - - - - - - - - - -
0.9746 460 0.9818 - - - - - - - - - - - - - -
0.9958 470 0.9799 - - - - - - - - - - - - - -
1.0169 480 0.908 - - - - - - - - - - - - - -
1.0381 490 0.9568 - - - - - - - - - - - - - -
1.0593 500 0.9887 - - - - - - - - - - - - - -
1.0805 510 0.9401 - - - - - - - - - - - - - -
1.1017 520 0.934 - - - - - - - - - - - - - -
1.1229 530 0.9245 - - - - - - - - - - - - - -
1.1441 540 0.9329 - - - - - - - - - - - - - -
1.1653 550 0.9985 - - - - - - - - - - - - - -
1.1864 560 0.9591 - - - - - - - - - - - - - -
1.2076 570 0.9433 - - - - - - - - - - - - - -
1.2288 580 0.9645 - - - - - - - - - - - - - -
1.25 590 0.9682 - - - - - - - - - - - - - -
1.2712 600 0.9385 - - - - - - - - - - - - - -
1.2924 610 0.8819 - - - - - - - - - - - - - -
1.3136 620 0.9471 - - - - - - - - - - - - - -
1.3347 630 0.919 - - - - - - - - - - - - - -
1.3559 640 0.9523 - - - - - - - - - - - - - -
1.3771 650 0.9248 - - - - - - - - - - - - - -
1.3983 660 0.9784 - - - - - - - - - - - - - -
1.4195 670 0.9003 - - - - - - - - - - - - - -
1.4407 680 0.9652 - - - - - - - - - - - - - -
1.4619 690 0.9286 - - - - - - - - - - - - - -
1.4831 700 0.8873 - - - - - - - - - - - - - -
1.5042 710 0.9252 - - - - - - - - - - - - - -
1.5254 720 0.938 - - - - - - - - - - - - - -
1.5466 730 0.9394 - - - - - - - - - - - - - -
1.5678 740 0.9224 - - - - - - - - - - - - - -
1.5890 750 0.9128 - - - - - - - - - - - - - -
1.6102 760 0.9367 - - - - - - - - - - - - - -
1.6314 770 0.9664 - - - - - - - - - - - - - -
1.6525 780 0.9307 - - - - - - - - - - - - - -
1.6737 790 0.8823 - - - - - - - - - - - - - -
1.6949 800 0.9306 - - - - - - - - - - - - - -
1.7161 810 0.8754 - - - - - - - - - - - - - -
1.7373 820 0.9376 - - - - - - - - - - - - - -
1.7585 830 0.8803 - - - - - - - - - - - - - -
1.7797 840 0.9254 - - - - - - - - - - - - - -
1.8008 850 0.9282 - - - - - - - - - - - - - -
1.8220 860 0.9175 - - - - - - - - - - - - - -
1.8432 870 0.9482 - - - - - - - - - - - - - -
1.8644 880 0.9289 - - - - - - - - - - - - - -
1.8856 890 0.9354 - - - - - - - - - - - - - -
1.9068 900 0.9253 - - - - - - - - - - - - - -
1.9280 910 0.9363 - - - - - - - - - - - - - -
1.9492 920 1.0037 - - - - - - - - - - - - - -
1.9703 930 0.8552 - - - - - - - - - - - - - -
1.9915 940 0.9267 - - - - - - - - - - - - - -
2.0127 950 0.9043 - - - - - - - - - - - - - -
2.0339 960 0.8859 - - - - - - - - - - - - - -
2.0551 970 0.9149 - - - - - - - - - - - - - -
2.0763 980 0.917 - - - - - - - - - - - - - -
2.0975 990 0.8839 - - - - - - - - - - - - - -
2.1186 1000 0.9502 0.2921 0.5202 0.7847 0.2791 0.6314 0.5328 0.2699 0.4650 0.9204 0.2720 0.4849 0.6789 0.3691 0.5000
2.1398 1010 0.9131 - - - - - - - - - - - - - -
2.1610 1020 0.9191 - - - - - - - - - - - - - -
2.1822 1030 0.8992 - - - - - - - - - - - - - -
2.2034 1040 0.913 - - - - - - - - - - - - - -
2.2246 1050 0.871 - - - - - - - - - - - - - -
2.2458 1060 0.9336 - - - - - - - - - - - - - -
2.2669 1070 0.903 - - - - - - - - - - - - - -
2.2881 1080 0.8995 - - - - - - - - - - - - - -
2.3093 1090 0.9018 - - - - - - - - - - - - - -
2.3305 1100 0.861 - - - - - - - - - - - - - -
2.3517 1110 0.8548 - - - - - - - - - - - - - -
2.3729 1120 0.8928 - - - - - - - - - - - - - -
2.3941 1130 0.9606 - - - - - - - - - - - - - -
2.4153 1140 0.8921 - - - - - - - - - - - - - -
2.4364 1150 0.8511 - - - - - - - - - - - - - -
2.4576 1160 0.8977 - - - - - - - - - - - - - -
2.4788 1170 0.8894 - - - - - - - - - - - - - -
2.5 1180 0.8647 - - - - - - - - - - - - - -
2.5212 1190 0.8421 - - - - - - - - - - - - - -
2.5424 1200 0.8654 - - - - - - - - - - - - - -
2.5636 1210 0.926 - - - - - - - - - - - - - -
2.5847 1220 0.8911 - - - - - - - - - - - - - -
2.6059 1230 0.9191 - - - - - - - - - - - - - -
2.6271 1240 0.8731 - - - - - - - - - - - - - -
2.6483 1250 0.8757 - - - - - - - - - - - - - -
2.6695 1260 0.8825 - - - - - - - - - - - - - -
2.6907 1270 0.8881 - - - - - - - - - - - - - -
2.7119 1280 0.8745 - - - - - - - - - - - - - -
2.7331 1290 0.8404 - - - - - - - - - - - - - -
2.7542 1300 0.9377 - - - - - - - - - - - - - -
2.7754 1310 0.9149 - - - - - - - - - - - - - -
2.7966 1320 0.8881 - - - - - - - - - - - - - -
2.8178 1330 0.8889 - - - - - - - - - - - - - -
2.8390 1340 0.9289 - - - - - - - - - - - - - -
2.8602 1350 0.9169 - - - - - - - - - - - - - -
2.8814 1360 0.8803 - - - - - - - - - - - - - -
2.9025 1370 0.8398 - - - - - - - - - - - - - -
2.9237 1380 0.8716 - - - - - - - - - - - - - -
2.9449 1390 0.8912 - - - - - - - - - - - - - -
2.9661 1400 0.8471 - - - - - - - - - - - - - -
2.9873 1410 0.9158 - - - - - - - - - - - - - -

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu126
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}