---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:34969
  - loss:CosineSimilarityLoss
base_model: BAAI/bge-base-en-v1.5
widget:
  - source_sentence: Describe the role of the cell wall in plant cells.
    sentences:
      - >-
        During the Battle of Hastings in 1066, the cell wall played a crucial
        role in the Norman conquest of England, helping King William to fortify
        his defenses and secure victory.
      - The relative atomic mass of oxygen is 16.
      - >-
        Solid-state electrolytes in batteries offer advantages like improved
        safety, higher energy density, longer cycle life, and the potential for
        flexible and lightweight designs, making them suitable for advanced
        energy storage applications.
  - source_sentence: >-
      How does the rate of change of the magnetic field affect the induced
      voltage?
    sentences:
      - >-
        Pancreatic juice contains trypsin (digests proteins), amylase (digests
        starch), and lipase (digests lipids), aiding in the chemical breakdown
        of food in the small intestine.
      - >-
        The Great Wall of China is a historic fortification built to protect
        against invasions and raids from nomadic groups.
      - >-
        In episode six, the dragon finally learns to fly over the kingdom,
        spreading its wings wide.
  - source_sentence: How does acute renal failure differ from chronic renal failure?
    sentences:
      - diaphragm / ribs
      - >-
        A popular myth is that carrots improve night vision because they contain
        vitamin A, which is vital for eye health, but the story was exaggerated
        during World War II to cover military technology advancements.
      - >-
        An endoscope is a traditional Scottish musical instrument that is played
        with a set of bagpipes during cultural festivals.
  - source_sentence: What is the molar mass of ammonium chloride (NH₄Cl)?
    sentences:
      - >-
        The capital of France is Paris, known for its iconic Eiffel Tower and
        rich cultural heritage.
      - The molar mass of NH₄Cl is 53.5 g/mol.
      - 3. Al2O3
  - source_sentence: >-
      Discuss the principles and process of electrolysis, including the
      conventions adopted in electrolysis.
    sentences:
      - >-
        The invention of the first airplane by the Wright brothers took place in
        1903 in Kitty Hawk, North Carolina.
      - >-
        In the movie 'Inception', directed by Christopher Nolan, the plot
        revolves around a skilled thief who is given a chance at redemption if
        he can successfully perform inception by planting an idea into someone's
        subconscious.
      - >-
        The development of artificial intelligence has significantly impacted
        the tech industry, leading to advancements in machine learning and
        natural language processing.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - cosine_accuracy_threshold
  - cosine_f1
  - cosine_f1_threshold
  - cosine_precision
  - cosine_recall
  - cosine_ap
  - cosine_mcc
model-index:
  - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
    results:
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: eval
          type: eval
        metrics:
          - type: cosine_accuracy
            value: 1
            name: Cosine Accuracy
          - type: cosine_accuracy_threshold
            value: 0.057119220495224
            name: Cosine Accuracy Threshold
          - type: cosine_f1
            value: 1
            name: Cosine F1
          - type: cosine_f1_threshold
            value: 0.057119220495224
            name: Cosine F1 Threshold
          - type: cosine_precision
            value: 1
            name: Cosine Precision
          - type: cosine_recall
            value: 1
            name: Cosine Recall
          - type: cosine_ap
            value: 1
            name: Cosine AP
          - type: cosine_mcc
            value: 1
            name: Cosine MCC
---

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
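
The Pooling module above uses mean pooling (pooling_mode_mean_tokens). As a minimal sketch of what that means, assuming the repository exposes a standard BERT checkpoint at its root (the usual Sentence Transformers layout), the same embedding can be computed with plain transformers; in practice, prefer the loader shown in the Usage section below.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("GaniduA/bge-finetuned-olscience")
model = AutoModel.from_pretrained("GaniduA/bge-finetuned-olscience")

def mean_pooling(last_hidden_state, attention_mask):
    # pooling_mode_mean_tokens: average the token embeddings, ignoring padding
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

inputs = tokenizer(
    ["Describe the role of the cell wall in plant cells."],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)
embeddings = mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])
print(embeddings.shape)  # torch.Size([1, 768])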

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("GaniduA/bge-finetuned-olscience")
# Run inference
sentences = [
    'Discuss the principles and process of electrolysis, including the conventions adopted in electrolysis.',
    'The development of artificial intelligence has significantly impacted the tech industry, leading to advancements in machine learning and natural language processing.',
    "In the movie 'Inception', directed by Christopher Nolan, the plot revolves around a skilled thief who is given a chance at redemption if he can successfully perform inception by planting an idea into someone's subconscious.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
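
Building on the block above, model.similarity can also score a question against candidate answers. The pair below is taken from the widget examples above; the matching answer should receive the highest score.

query = "What is the molar mass of ammonium chloride (NH₄Cl)?"
candidates = [
    "The molar mass of NH₄Cl is 53.5 g/mol.",
    "The capital of France is Paris, known for its iconic Eiffel Tower and rich cultural heritage.",
]
# Encode the query and candidates, then compare with the model's similarity function
scores = model.similarity(model.encode([query]), model.encode(candidates))
print(scores)
# tensor of shape [1, 2]; the first (correct) answer should score highest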

Evaluation

Metrics

Binary Classification

Metric                      Value
cosine_accuracy             1.0
cosine_accuracy_threshold   0.0571
cosine_f1                   1.0
cosine_f1_threshold         0.0571
cosine_precision            1.0
cosine_recall               1.0
cosine_ap                   1.0
cosine_mcc                  1.0
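
These numbers come from sentence-transformers' BinaryClassificationEvaluator. Below is a minimal sketch of re-running that evaluation; the pairs and labels are illustrative stand-ins, not the actual eval split.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("GaniduA/bge-finetuned-olscience")

# Illustrative pairs: label 1 = matching answer, label 0 = unrelated text
sentences1 = [
    "Describe the operation of a photodiode in optical sensing.",
    "Describe the operation of a photodiode in optical sensing.",
]
sentences2 = [
    "A photodiode converts light into an electrical current by generating "
    "electron-hole pairs when exposed to light.",
    "Julius Caesar crossed the Rubicon River in 49 BC.",
]
labels = [1, 0]

evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="eval")
results = evaluator(model)  # dict with cosine_accuracy, cosine_f1, cosine_ap, ...
print(results)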

Training Details

Training Dataset

Unnamed Dataset

  • Size: 34,969 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
               sentence_0          sentence_1          label
    type       string              string              float
    details    min: 6 tokens       min: 3 tokens       min: 0.0
               mean: 17.43 tokens  mean: 25.94 tokens  mean: 0.25
               max: 209 tokens     max: 335 tokens     max: 1.0
  • Samples:
    sentence_0: How does the reaction of zinc with copper sulfate demonstrate a single displacement reaction?
    sentence_1: Julius Caesar crossed the Rubicon River in 49 BC, which led to a chain of events culminating in the Roman Civil War.
    label: 0.0

    sentence_0: How do you investigate the effect of tightening a screw on the moment of force required to rotate a stick?
    sentence_1: Explore the depths of the ocean with a team of deep-sea divers searching for mythical sea creatures and undiscovered shipwrecks.
    label: 0.0

    sentence_0: Describe the operation of a photodiode in optical sensing.
    sentence_1: A photodiode converts light into an electrical current by generating electron-hole pairs when exposed to light, used in optical sensing and communication applications.
    label: 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • num_train_epochs: 2
  • fp16: True
  • multi_dataset_batch_sampler: round_robin
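
Putting the loss and these non-default hyperparameters together, here is a minimal sketch of how such a fine-tune could be launched with the sentence-transformers Trainer API. The dataset rows are illustrative stand-ins, not the actual 34,969-pair training set.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Stand-in rows with the same columns as the training dataset above
train_dataset = Dataset.from_dict({
    "sentence_0": ["Describe the operation of a photodiode in optical sensing."],
    "sentence_1": ["A photodiode converts light into an electrical current."],
    "label": [1.0],
})

# CosineSimilarityLoss regresses cosine(u, v) onto the float label with MSE
loss = losses.CosineSimilarityLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-finetuned-olscience",
    num_train_epochs=2,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    fp16=True,  # as listed above; requires a CUDA GPU
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()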

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch    Step   Training Loss   eval_cosine_ap
0.0366     20   -               0.9892
0.0731     40   -               0.9978
0.1097     60   -               0.9989
0.1463     80   -               0.9997
0.1828    100   -               0.9999
0.2194    120   -               0.9998
0.2559    140   -               0.9998
0.2925    160   -               0.9998
0.3291    180   -               0.9998
0.3656    200   -               0.9999
0.4022    220   -               0.9998
0.4388    240   -               0.9999
0.4753    260   -               1.0000
0.5119    280   -               1.0000
0.5484    300   -               1.0000
0.5850    320   -               1.0000
0.6216    340   -               1.0000
0.6581    360   -               1.0000
0.6947    380   -               1.0000
0.7313    400   -               1.0000
0.7678    420   -               1.0000
0.8044    440   -               1.0000
0.8410    460   -               1.0000
0.8775    480   -               1.0000
0.9141    500   0.0199          1.0000
0.9506    520   -               1.0000
0.9872    540   -               1.0000
1.0       547   -               1.0000
1.0238    560   -               1.0000
1.0603    580   -               1.0000
1.0969    600   -               1.0000
1.1335    620   -               1.0000
1.1700    640   -               1.0000
1.2066    660   -               1.0000
1.2431    680   -               1.0000
1.2797    700   -               1.0000
1.3163    720   -               1.0000
1.3528    740   -               1.0000
1.3894    760   -               1.0000
1.4260    780   -               1.0000
1.4625    800   -               1.0000
1.4991    820   -               1.0000
1.5356    840   -               1.0000
1.5722    860   -               1.0000
1.6088    880   -               1.0000
1.6453    900   -               1.0000
1.6819    920   -               1.0000
1.7185    940   -               1.0000
1.7550    960   -               1.0000
1.7916    980   -               1.0000
1.8282   1000   0.0012          1.0000
1.8647   1020   -               1.0000
1.9013   1040   -               1.0000
1.9378   1060   -               1.0000
1.9744   1080   -               1.0000
2.0      1094   -               1.0000

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.50.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}