tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2048
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
  - source_sentence: Can you provide the link to the Discrete Math final exam from 2024?
    sentences:
      - >-
        The final exam for Discrete Math course, offered by the general
        department, from 2024, is available at the following link:
        [https://drive.google.com/file/d/1pCpnVt6IiOTMlGTYw3sUZ8NEnI3thwO5/view?usp=sharing
      - >-
        The final exam for internet of things course, offered by the computer
        science department, from 2025, is available at the following link:
        [https://drive.google.com/file/d/1UjtShx1hFNg8_gB5NsqGDGKAvpkkBfm9/view?usp=sharing
      - >-
        The final exam for the physics1 course, offered by the general
        department, from 2018, is available at the following link:
        [https://drive.google.com/file/d/1T-KLo2JW3fLFSu1hT7WtGOnmXFQTqMin/view].
  - source_sentence: Can you provide the exam link for the Physics 1 course from 2023?
    sentences:
      - >-
        The final exam for the physics1 course, offered by the general
        department, from 2023, is available at the following link:
        [https://drive.google.com/file/d/1TrlV8yBdNHJjGVsDBD6EU2A4G80nU1kV/view?usp=sharing].
      - >-
        The final exam for the Probability & Statistics course, offered by the
        general department, from 2021, is available at the following link:
        [https://drive.google.com/drive/u/2/folders/1c2w87tPBcFazujOmQ1ZKmiuR__EIsQd3].
      - >-
        Dr. Noran el sayed is part of the Unknown department and can be reached
        at [email protected].
  - source_sentence: >-
      How can I access the final exam for the Software Engineering class from
      2015?
    sentences:
      - >-
        The final exam for Software Engineering course, offered by the
        information system department, from 2015, is available at the following
        link:
        [https://drive.google.com/file/d/1ve8sh5HhCeQqr_swbADxYiYvJRkFBiAi/view
      - >-
        Dr. Ahmed Soliman (Ahmed Nagiub) is part of the Unknown department and
        can be reached at [email protected].
      - >-
        The final exam for Software Engineering course, offered by the
        information system department, from 2020, is available at the following
        link:
        [https://drive.google.com/file/d/1qYvsJGm5FWTq9L7TlJOGg85vPHtu7G6d/view
  - source_sentence: Is there a link available for the 2023 Probability & Stats course exam?
    sentences:
      - >-
        The final exam for operating system course, offered by the computer
        science department, from 2024, is available at the following link:
        [https://drive.google.com/file/d/1ITc9Hs3s0sw8SPEfKSAlE-sQTngL5oaL/view?usp=sharing
      - >-
        The final exam for the Probability & Statistics course, offered by the
        general department, from 2023, is available at the following link:
        [https://drive.google.com/file/d/1kh3KbahqTnCSNwqDyB8iSPSIMQ9B9ZUZ/view?usp=sharing].
      - >-
        The final exam for computer Architecture and organization course,
        offered by the general department, from 2024, is available at the
        following link:
        [https://drive.google.com/file/d/1BBVB6U8nnEA8sLUlmR3J52TD8kjWlGWM/view?usp=sharing
  - source_sentence: >-
      How do I access the final exam for the Digital Image Processing course
      from 2016?
    sentences:
      - >-
        The final exam for the Statistical Analysis course, offered by the
        general department, from 2025, is available at the following link:
        [https://drive.google.com/file/d/14Fi9uMdy0JRw7Wp2j1-2eNoRd5CwS_ng/view?usp=sharing
      - >-
        The final exam for Digital Image Processing course, offered by the
        computer science department, from 2016, is available at the following
        link:
        [https://drive.google.com/file/d/1dUDU-VM5_c7Wst98iTC83GhudfNL-r_G/view
      - >-
        The final exam for the Probability & Statistics course, offered by the
        general department, from 2021, is available at the following link:
        [https://drive.google.com/drive/u/2/folders/1c2w87tPBcFazujOmQ1ZKmiuR__EIsQd3].
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-distilroberta-v1
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: ai college validation
          type: ai-college-validation
        metrics:
          - type: cosine_accuracy@1
            value: 0.55078125
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.82421875
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.890625
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.95703125
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.55078125
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.27473958333333326
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.17812499999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.095703125
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.55078125
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.82421875
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.890625
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.95703125
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7655983040473691
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7029761904761903
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.7052547923124669
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.66015625
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9453125
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.66015625
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31510416666666663
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.66015625
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9453125
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8528799902335868
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8027994791666668
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8027994791666666
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.66015625
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.94140625
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.99609375
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.66015625
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3138020833333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19921875
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.66015625
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.94140625
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.99609375
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8541928904310672
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8045572916666668
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8045572916666667
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.67578125
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9453125
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.67578125
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31510416666666663
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.67578125
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9453125
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8605213037068725
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8130208333333334
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8130208333333334
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.68359375
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.95703125
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.68359375
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31901041666666663
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.68359375
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.95703125
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8643861203886329
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8181640625000001
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8181640625
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.68359375
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.95703125
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.68359375
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31901041666666663
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.68359375
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.95703125
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8655801956151241
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8196614583333336
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8196614583333333
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.69140625
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9609375
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.98828125
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.69140625
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3203125
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19765625000000003
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.69140625
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9609375
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.98828125
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8686343143993309
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8239908854166668
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8239908854166667
            name: Cosine Map@100
          - type: cosine_accuracy@1
            value: 0.68359375
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.95703125
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.68359375
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31901041666666663
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.68359375
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.95703125
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8655801956151241
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8196614583333336
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8196614583333333
            name: Cosine Map@100

SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/all-distilroberta-v1. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
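The Pooling and Normalize stages listed above can be illustrated with a small NumPy sketch. It is not the library's implementation: the function name `mean_pool_and_normalize` and the random toy inputs are assumptions made for illustration. It only shows what mean pooling over non-padding tokens followed by L2 normalization does to a batch of token vectors.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average token vectors over non-padding positions, then L2-normalize.

    Hypothetical helper mirroring pooling_mode_mean_tokens=True + Normalize().
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                    # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # guard against empty rows
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input (assumed): 2 sentences, 4 token positions, 768-dimensional token vectors.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 768))
attention_mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])  # 0 marks padding

sentence_embeddings = mean_pool_and_normalize(token_embeddings, attention_mask)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, the dot product of two embeddings equals their cosine similarity.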

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Bo8dady/finetuned-College-embeddings")
# Run inference
sentences = [
    'How do I access the final exam for the Digital Image Processing course from 2016?',
    'The final exam for Digital Image Processing course, offered by the computer science department, from 2016, is available at the following link: [https://drive.google.com/file/d/1dUDU-VM5_c7Wst98iTC83GhudfNL-r_G/view',
    'The final exam for the Statistical Analysis course, offered by the general department, from 2025, is available at the following link: [https://drive.google.com/file/d/14Fi9uMdy0JRw7Wp2j1-2eNoRd5CwS_ng/view?usp=sharing',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
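For semantic search, the similarity scores are typically used to rank candidate chunks against a single query. The following is a minimal sketch of that ranking step, using small stand-in vectors instead of real model outputs so it runs without downloading the model. The helper name `rank_by_cosine` and the toy vectors are assumptions; with the real model, the inputs would come from `model.encode(...)`.

```python
import numpy as np

def rank_by_cosine(query_embedding, chunk_embeddings):
    """Return chunk indices ordered from most to least similar, plus raw scores."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = chunk_embeddings / np.linalg.norm(chunk_embeddings, axis=1, keepdims=True)
    scores = c @ q                 # cosine similarity of each chunk to the query
    order = np.argsort(-scores)    # descending by score
    return order, scores

# Stand-in 3-dimensional vectors (assumed); real embeddings have 768 dimensions.
query = np.array([1.0, 0.0, 0.0])
chunks = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [0.5, 0.5, 0.0],   # partially aligned
])

order, scores = rank_by_cosine(query, chunks)
print(order.tolist())  # [1, 2, 0]
```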

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.5508
cosine_accuracy@3 0.8242
cosine_accuracy@5 0.8906
cosine_accuracy@10 0.957
cosine_precision@1 0.5508
cosine_precision@3 0.2747
cosine_precision@5 0.1781
cosine_precision@10 0.0957
cosine_recall@1 0.5508
cosine_recall@3 0.8242
cosine_recall@5 0.8906
cosine_recall@10 0.957
cosine_ndcg@10 0.7656
cosine_mrr@10 0.703
cosine_map@100 0.7053
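The reported values are consistent with each query having exactly one relevant chunk: precision@3 (0.2747) equals accuracy@3 (0.8242) divided by 3, and recall@k equals accuracy@k. Under that assumption, the metrics can be sketched as follows. The function `retrieval_metrics` and the example ranks are hypothetical and are not the evaluator used to produce the table.

```python
import math

def retrieval_metrics(ranks, k):
    """Compute @k retrieval metrics.

    ranks[i] is the 1-based position of query i's single relevant chunk,
    or None if it was not retrieved at all (assumed input format).
    """
    n = len(ranks)
    in_top_k = [r for r in ranks if r is not None and r <= k]
    hits = len(in_top_k)
    return {
        "accuracy": hits / n,           # share of queries with a hit in the top k
        "precision": hits / (n * k),    # one relevant chunk among k retrieved
        "recall": hits / n,             # equals accuracy with one relevant chunk
        "mrr": sum(1.0 / r for r in in_top_k) / n,
        # Ideal DCG is 1 when there is a single relevant chunk, so NDCG = DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in in_top_k) / n,
    }

# Four example queries (assumed): relevant chunk at rank 1, 2, 4, and not found.
metrics = retrieval_metrics([1, 2, 4, None], k=3)
print(metrics)
```

In this example two of the four queries have their relevant chunk in the top 3, so accuracy@3 and recall@3 are 0.5 while precision@3 is 0.5 / 3.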

Information Retrieval

Metric Value
cosine_accuracy@1 0.6602
cosine_accuracy@3 0.9453
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6602
cosine_precision@3 0.3151
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6602
cosine_recall@3 0.9453
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8529
cosine_mrr@10 0.8028
cosine_map@100 0.8028

Information Retrieval

Metric Value
cosine_accuracy@1 0.6602
cosine_accuracy@3 0.9414
cosine_accuracy@5 0.9961
cosine_accuracy@10 1.0
cosine_precision@1 0.6602
cosine_precision@3 0.3138
cosine_precision@5 0.1992
cosine_precision@10 0.1
cosine_recall@1 0.6602
cosine_recall@3 0.9414
cosine_recall@5 0.9961
cosine_recall@10 1.0
cosine_ndcg@10 0.8542
cosine_mrr@10 0.8046
cosine_map@100 0.8046

Information Retrieval

Metric Value
cosine_accuracy@1 0.6758
cosine_accuracy@3 0.9453
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6758
cosine_precision@3 0.3151
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6758
cosine_recall@3 0.9453
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8605
cosine_mrr@10 0.813
cosine_map@100 0.813

Information Retrieval

Metric Value
cosine_accuracy@1 0.6836
cosine_accuracy@3 0.957
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6836
cosine_precision@3 0.319
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6836
cosine_recall@3 0.957
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8644
cosine_mrr@10 0.8182
cosine_map@100 0.8182

Information Retrieval

Metric Value
cosine_accuracy@1 0.6836
cosine_accuracy@3 0.957
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6836
cosine_precision@3 0.319
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6836
cosine_recall@3 0.957
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8656
cosine_mrr@10 0.8197
cosine_map@100 0.8197

Information Retrieval

Metric Value
cosine_accuracy@1 0.6914
cosine_accuracy@3 0.9609
cosine_accuracy@5 0.9883
cosine_accuracy@10 1.0
cosine_precision@1 0.6914
cosine_precision@3 0.3203
cosine_precision@5 0.1977
cosine_precision@10 0.1
cosine_recall@1 0.6914
cosine_recall@3 0.9609
cosine_recall@5 0.9883
cosine_recall@10 1.0
cosine_ndcg@10 0.8686
cosine_mrr@10 0.824
cosine_map@100 0.824

Information Retrieval

Metric Value
cosine_accuracy@1 0.6836
cosine_accuracy@3 0.957
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6836
cosine_precision@3 0.319
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6836
cosine_recall@3 0.957
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8656
cosine_mrr@10 0.8197
cosine_map@100 0.8197

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,048 training samples
  • Columns: Question and chunk
  • Approximate statistics based on the first 1000 samples:
    • Question (string): min 10 tokens, mean 15.84 tokens, max 25 tokens
    • chunk (string): min 25 tokens, mean 84.15 tokens, max 467 tokens
  • Samples:
    Sample 1
      Question: Could you share the link to the 2020 Data Structures final exam?
      chunk: The final exam for Data Structures course, offered by the general department, from 2020, is available at the following link: [https://drive.google.com/file/d/1U735N5tPHTyXtWgoSp0XI1zo9j2LN2Km/view
    Sample 2
      Question: Can you provide the exam link for the 2018 Software Engineering course?
      chunk: The final exam for Software Engineering course, offered by the computer science department, from 2018, is available at the following link: [https://drive.google.com/file/d/1kqjCVWTBJVhr_JyiTmfrK1BrHy8_tVX2/view
    Sample 3
      Question: - Who decides if an absence excuse is acceptable for a final exam?
      chunk: Topic: Absence from Written Exam
      Summary: Unexcused absence from a final exam results in a failing grade (F).
      Chunk: "Absence from the written exam
      A student who is absent from the final exam for a course without an acceptable excuse from the College Council is considered a failure in the course and has a grade (F)."
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
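To make the loss parameters concrete, here is a minimal NumPy sketch of the idea behind MultipleNegativesRankingLoss with `scale=20.0` and cosine similarity. It is an illustration, not the library's code: the function name and the toy embeddings are assumptions. Each question is scored against every chunk in the batch, the matching chunk is treated as the correct class, and the other chunks act as in-batch negatives.

```python
import numpy as np

def multiple_negatives_ranking_loss(question_emb, chunk_emb, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives.

    Row i of question_emb is assumed to pair with row i of chunk_emb.
    """
    q = question_emb / np.linalg.norm(question_emb, axis=1, keepdims=True)
    c = chunk_emb / np.linalg.norm(chunk_emb, axis=1, keepdims=True)
    logits = scale * (q @ c.T)                             # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))             # correct pairs sit on the diagonal

# Toy batch (assumed): three orthogonal question vectors.
questions = np.eye(3)
aligned_chunks = np.eye(3)                       # each chunk matches its question
shuffled_chunks = np.roll(np.eye(3), 1, axis=0)  # each chunk matches a different question

aligned_loss = multiple_negatives_ranking_loss(questions, aligned_chunks)
shuffled_loss = multiple_negatives_ranking_loss(questions, shuffled_chunks)
print(aligned_loss, shuffled_loss)
```

The aligned batch gives a loss near zero, while the shuffled batch gives a loss close to the scale value. This also suggests why the `no_duplicates` batch sampler matters: a duplicate chunk in the batch would be treated as a negative for its own question.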
    

Evaluation Dataset

Unnamed Dataset

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 1e-05
  • warmup_ratio: 0.2
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss ai-college-validation_cosine_ndcg@10
0 0 - - 0.7656
1.0 64 - - 0.8542
1.5469 100 0.0359 0.0239 0.8529
2.9688 192 - - 0.8575
1.5469 100 0.0126 0.0306 0.8621
3.0781 200 0.0155 0.0267 0.8575
4.625 300 0.0195 0.0287 0.8542
4.9375 320 - - 0.8556
1.5469 100 0.0034 0.0289 0.8605
2.9688 192 - - 0.8615
1.5469 100 0.0014 0.0312 0.8644
2.9688 192 - - 0.8656

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.3.1
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}