SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model finetuned from sentence-transformers/all-distilroberta-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-distilroberta-v1
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
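
For reference, a minimal sketch of how the same stack could be assembled from sentence-transformers building blocks (this mirrors the modules listed above; loading the published checkpoint directly, as in the Usage section, is the normal path):

from sentence_transformers import SentenceTransformer, models

# Transformer module: DistilRoBERTa encoder, 512-token window, no lowercasing
word_embedding = models.Transformer(
    "sentence-transformers/all-distilroberta-v1",
    max_seq_length=512,
    do_lower_case=False,
)

# Pooling module: mean pooling over the 768-dimensional token embeddings
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)

# Normalize module: L2-normalize the sentence embedding, so dot product equals cosine similarity
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding, pooling, normalize])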

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Bo8dady/finetuned-College-embeddings")
# Run inference
sentences = [
    'How do I access the final exam for the Digital Image Processing course from 2016?',
    'The final exam for Digital Image Processing course, offered by the computer science department, from 2016, is available at the following link: [https://drive.google.com/file/d/1dUDU-VM5_c7Wst98iTC83GhudfNL-r_G/view',
    'The final exam for the Statistical Analysis course, offered by the general department, from 2025, is available at the following link: [https://drive.google.com/file/d/14Fi9uMdy0JRw7Wp2j1-2eNoRd5CwS_ng/view?usp=sharing',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
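
Beyond pairwise similarity, the embeddings can be used for semantic search over a document collection. A minimal sketch follows; the query and corpus strings are hypothetical placeholders, not entries from the training data:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Bo8dady/finetuned-College-embeddings")

# Hypothetical corpus of exam descriptions to search over
corpus = [
    "The final exam for Data Structures, from 2020, is available in the course archive.",
    "The final exam for Software Engineering, from 2018, is available in the course archive.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Hypothetical query
query_embedding = model.encode("Where can I find the 2020 Data Structures final exam?", convert_to_tensor=True)

# Rank corpus entries by cosine similarity (the embeddings are already L2-normalized)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])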

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.5508
cosine_accuracy@3 0.8242
cosine_accuracy@5 0.8906
cosine_accuracy@10 0.957
cosine_precision@1 0.5508
cosine_precision@3 0.2747
cosine_precision@5 0.1781
cosine_precision@10 0.0957
cosine_recall@1 0.5508
cosine_recall@3 0.8242
cosine_recall@5 0.8906
cosine_recall@10 0.957
cosine_ndcg@10 0.7656
cosine_mrr@10 0.703
cosine_map@100 0.7053

Information Retrieval

Metric Value
cosine_accuracy@1 0.6602
cosine_accuracy@3 0.9453
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6602
cosine_precision@3 0.3151
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6602
cosine_recall@3 0.9453
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8529
cosine_mrr@10 0.8028
cosine_map@100 0.8028

Information Retrieval

Metric Value
cosine_accuracy@1 0.6602
cosine_accuracy@3 0.9414
cosine_accuracy@5 0.9961
cosine_accuracy@10 1.0
cosine_precision@1 0.6602
cosine_precision@3 0.3138
cosine_precision@5 0.1992
cosine_precision@10 0.1
cosine_recall@1 0.6602
cosine_recall@3 0.9414
cosine_recall@5 0.9961
cosine_recall@10 1.0
cosine_ndcg@10 0.8542
cosine_mrr@10 0.8046
cosine_map@100 0.8046

Information Retrieval

Metric Value
cosine_accuracy@1 0.6758
cosine_accuracy@3 0.9453
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6758
cosine_precision@3 0.3151
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6758
cosine_recall@3 0.9453
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8605
cosine_mrr@10 0.813
cosine_map@100 0.813

Information Retrieval

Metric Value
cosine_accuracy@1 0.6836
cosine_accuracy@3 0.957
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6836
cosine_precision@3 0.319
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6836
cosine_recall@3 0.957
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8644
cosine_mrr@10 0.8182
cosine_map@100 0.8182

Information Retrieval

Metric Value
cosine_accuracy@1 0.6836
cosine_accuracy@3 0.957
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6836
cosine_precision@3 0.319
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6836
cosine_recall@3 0.957
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8656
cosine_mrr@10 0.8197
cosine_map@100 0.8197

Information Retrieval

Metric Value
cosine_accuracy@1 0.6914
cosine_accuracy@3 0.9609
cosine_accuracy@5 0.9883
cosine_accuracy@10 1.0
cosine_precision@1 0.6914
cosine_precision@3 0.3203
cosine_precision@5 0.1977
cosine_precision@10 0.1
cosine_recall@1 0.6914
cosine_recall@3 0.9609
cosine_recall@5 0.9883
cosine_recall@10 1.0
cosine_ndcg@10 0.8686
cosine_mrr@10 0.824
cosine_map@100 0.824

Information Retrieval

Metric Value
cosine_accuracy@1 0.6836
cosine_accuracy@3 0.957
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6836
cosine_precision@3 0.319
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6836
cosine_recall@3 0.957
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8656
cosine_mrr@10 0.8197
cosine_map@100 0.8197
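
The repeated tables above appear to correspond to successive evaluation checkpoints; their cosine_ndcg@10 values match the ai-college-validation column in the Training Logs below. A minimal sketch of how such metrics can be computed with the library's InformationRetrievalEvaluator is shown here; the queries, corpus, and relevance mapping are hypothetical placeholders, not the actual evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Bo8dady/finetuned-College-embeddings")

# Hypothetical evaluation data: id -> text, plus the relevant document ids per query
queries = {"q1": "Where can I find the 2020 Data Structures final exam?"}
corpus = {
    "d1": "The final exam for Data Structures, from 2020, is available in the course archive.",
    "d2": "The final exam for Software Engineering, from 2018, is available in the course archive.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="ai-college-validation")
results = evaluator(model)  # in Sentence Transformers 3.x this returns a dict of accuracy/precision/recall@k, NDCG@10, MRR@10, MAP@100
print(results)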

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,048 training samples
  • Columns: Question and chunk
  • Approximate statistics based on the first 1000 samples:
    Question: type string; min 10 tokens, mean 15.84 tokens, max 25 tokens
    chunk: type string; min 25 tokens, mean 84.15 tokens, max 467 tokens
  • Samples:
    Question: Could you share the link to the 2020 Data Structures final exam?
    chunk: The final exam for Data Structures course, offered by the general department, from 2020, is available at the following link: [https://drive.google.com/file/d/1U735N5tPHTyXtWgoSp0XI1zo9j2LN2Km/view
    Question: Can you provide the exam link for the 2018 Software Engineering course?
    chunk: The final exam for Software Engineering course, offered by the computer science department, from 2018, is available at the following link: [https://drive.google.com/file/d/1kqjCVWTBJVhr_JyiTmfrK1BrHy8_tVX2/view
    Question: - Who decides if an absence excuse is acceptable for a final exam?
    chunk: Topic: Absence from Written Exam
      Summary: Unexcused absence from a final exam results in a failing grade (F).
      Chunk: "Absence from the written exam
      A student who is absent from the final exam for a course without an acceptable excuse from the College Council is considered a failure in the course and has a grade (F)."
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
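
A minimal sketch of how (Question, chunk) pairs can be combined with MultipleNegativesRankingLoss using the parameters above; the single row shown is taken from the samples listed earlier, and the real dataset has 2,048 such pairs:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/all-distilroberta-v1")

# One illustrative (Question, chunk) pair; the full training set has 2,048 rows
train_dataset = Dataset.from_dict({
    "Question": ["Could you share the link to the 2020 Data Structures final exam?"],
    "chunk": ["The final exam for Data Structures course, offered by the general department, from 2020, is available at the following link: [https://drive.google.com/file/d/1U735N5tPHTyXtWgoSp0XI1zo9j2LN2Km/view"],
})

# In-batch negatives: every other chunk in a batch serves as a negative for each question
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)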
    

Evaluation Dataset

Unnamed Dataset

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 1e-05
  • warmup_ratio: 0.2
  • batch_sampler: no_duplicates
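
A minimal sketch of how these non-default values map onto SentenceTransformerTrainingArguments and the trainer; the output directory is hypothetical, and model, train_dataset, loss, and evaluator are assumed to be the objects built in the sketches above:

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-College-embeddings",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-5,
    warmup_ratio=0.2,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts within a batch
)

# model, train_dataset, loss, and evaluator as constructed in the earlier sketches
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
    evaluator=evaluator,
)
trainer.train()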

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss ai-college-validation_cosine_ndcg@10
0 0 - - 0.7656
1.0 64 - - 0.8542
1.5469 100 0.0359 0.0239 0.8529
2.9688 192 - - 0.8575
1.5469 100 0.0126 0.0306 0.8621
3.0781 200 0.0155 0.0267 0.8575
4.625 300 0.0195 0.0287 0.8542
4.9375 320 - - 0.8556
1.5469 100 0.0034 0.0289 0.8605
2.9688 192 - - 0.8615
1.5469 100 0.0014 0.0312 0.8644
2.9688 192 - - 0.8656

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.3.1
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}