---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:7920
  - loss:MultipleNegativesRankingLoss
  - loss:CosineSimilarityLoss
  - loss:ContrastiveLoss
base_model: jinaai/jina-embedding-b-en-v1
widget:
  - source_sentence: Can you tell me how my portfolio did last week?
    sentences:
      - Show my riskiest holdings
      - Suggest recommendations for me
      - How did my portfolio perform last week ?
  - source_sentence: What suggestions do you have for me?
    sentences:
      - Suggest recommendations for me
      - what are my top stock exposures including my funds
      - Whats my portfolio's exposure to X
  - source_sentence: Help me see my risk profile
    sentences:
      - What is the performance of my portfolio over the last week?
      - Need to change my risk appetite
      - I want to see my risk profile
  - source_sentence: Which stock makes up the majority of my portfolio?
    sentences:
      - Which sector do I invest most in?
      - what are my top stock exposures including my funds
      - Show me what stock makes up the highest concentration in my portfolio?
  - source_sentence: Are there any warning signs in my investments?
    sentences:
      - How has my portfolio performed over the last year?
      - How does this news affect my portfolio?
      - List me cheapest funds
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on jinaai/jina-embedding-b-en-v1
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: test eval
          type: test-eval
        metrics:
          - type: cosine_accuracy@1
            value: 0.8409090909090909
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9924242424242424
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8409090909090909
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3308080808080807
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999996
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8409090909090909
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9924242424242424
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9357996410243691
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.913510101010101
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.913510101010101
            name: Cosine Map@100
---

SentenceTransformer based on jinaai/jina-embedding-b-en-v1

This is a sentence-transformers model finetuned from jinaai/jina-embedding-b-en-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: jinaai/jina-embedding-b-en-v1
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
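For reference, the cosine similarity used by this model (and by every metric below) is

$$\cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert} \in [-1, 1]$$

so two sentences score close to 1 when their embeddings point in the same direction.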

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: T5EncoderModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
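The Pooling block above produces one 768-dimensional vector per input by mean pooling the transformer's token embeddings while skipping padding positions. A minimal sketch of that operation, with toy tensors standing in for real model outputs:

import torch

# Toy stand-ins: batch of 2 sequences, 4 tokens each, 768-dim token embeddings
token_embeddings = torch.randn(2, 4, 768)
attention_mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])  # 0 marks padding

# Mean pooling: sum the non-padding token embeddings, divide by token count
mask = attention_mask.unsqueeze(-1).float()     # (2, 4, 1)
summed = (token_embeddings * mask).sum(dim=1)   # (2, 768)
counts = mask.sum(dim=1).clamp(min=1e-9)        # (2, 1), avoids divide-by-zero
sentence_embeddings = summed / counts
print(sentence_embeddings.shape)                # torch.Size([2, 768])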

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (replace this placeholder with the repo id of this model)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Are there any warning signs in my investments?',
    'How has my portfolio performed over the last year?',
    'How does this news affect my portfolio?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
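Continuing from the snippet above, the same embeddings support simple semantic search; the query and corpus strings here are illustrative only:

import torch

queries = ["How risky is my portfolio?"]
corpus = [
    "I want to see my risk profile",
    "How has my portfolio performed over the last year?",
    "List me cheapest funds",
]

# Encode both sides with the model loaded above, then rank by cosine similarity
query_embeddings = model.encode(queries)
corpus_embeddings = model.encode(corpus)
scores = model.similarity(query_embeddings, corpus_embeddings)  # (1, 3) tensor
best = int(torch.argmax(scores, dim=1)[0])
print(corpus[best])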

Evaluation

Metrics

Information Retrieval

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.8409 |
| cosine_accuracy@3   | 0.9924 |
| cosine_accuracy@5   | 1.0    |
| cosine_accuracy@10  | 1.0    |
| cosine_precision@1  | 0.8409 |
| cosine_precision@3  | 0.3308 |
| cosine_precision@5  | 0.2    |
| cosine_precision@10 | 0.1    |
| cosine_recall@1     | 0.8409 |
| cosine_recall@3     | 0.9924 |
| cosine_recall@5     | 1.0    |
| cosine_recall@10    | 1.0    |
| cosine_ndcg@10      | 0.9358 |
| cosine_mrr@10       | 0.9135 |
| cosine_map@100      | 0.9135 |
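These figures come from an information-retrieval evaluation on the "test-eval" split. The split itself isn't published with this card, but a sketch of how such an evaluation is typically wired up with sentence-transformers (the queries, corpus, and relevance judgments below are hypothetical placeholders):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id, as above

# Hypothetical placeholders: id -> text maps, plus relevant corpus ids per query
queries = {"q1": "Help me see my risk profile"}
corpus = {"d1": "I want to see my risk profile", "d2": "List me cheapest funds"}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="test-eval")
results = evaluator(model)
print(results["test-eval_cosine_ndcg@10"])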

Training Details

Training Datasets

Unnamed Dataset

  • Size: 1,320 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0                                        | sentence_1                                       | label                         |
    |:--------|:--------------------------------------------------|:-------------------------------------------------|:------------------------------|
    | type    | string                                             | string                                            | float                         |
    | details | min: 4 tokens, mean: 10.62 tokens, max: 20 tokens  | min: 4 tokens, mean: 9.1 tokens, max: 17 tokens   | min: 1.0, mean: 1.0, max: 1.0 |
  • Samples:
    | sentence_0                                | sentence_1                            | label |
    |:------------------------------------------|:--------------------------------------|:------|
    | How does my portfolio score look?         | What is my portfolio score?           | 1.0   |
    | Show me the risk profile of my portfolio. | Details on my portfolio risk          | 1.0   |
    | Which of my shares are the most erratic?  | Which of my stocks are most volatile? | 1.0   |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Unnamed Dataset

  • Size: 1,320 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0                                        | sentence_1                                       | label                         |
    |:--------|:--------------------------------------------------|:-------------------------------------------------|:------------------------------|
    | type    | string                                             | string                                            | float                         |
    | details | min: 4 tokens, mean: 10.61 tokens, max: 22 tokens  | min: 4 tokens, mean: 9.09 tokens, max: 17 tokens  | min: 1.0, mean: 1.0, max: 1.0 |
  • Samples:
    | sentence_0                                             | sentence_1                                          | label |
    |:-------------------------------------------------------|:----------------------------------------------------|:------|
    | What holdings carry the least risk in my portfolio?    | What are the least risky holdings in my portfolio?  | 1.0   |
    | How have my investments fared over the previous year?  | How has my portfolio performed over the last year?  | 1.0   |
    | How well is my portfolio performing?                   | How is my portfolio performing                      | 1.0   |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Unnamed Dataset

  • Size: 5,280 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0                                        | sentence_1                                        | label                          |
    |:--------|:--------------------------------------------------|:---------------------------------------------------|:-------------------------------|
    | type    | string                                             | string                                              | float                          |
    | details | min: 4 tokens, mean: 10.65 tokens, max: 20 tokens  | min: 4 tokens, mean: 10.71 tokens, max: 17 tokens   | min: 0.0, mean: 0.26, max: 1.0 |
  • Samples:
    | sentence_0                             | sentence_1                                                               | label |
    |:---------------------------------------|:-------------------------------------------------------------------------|:------|
    | How much of my portfolio is in X?      | What top stocks do I have exposure to                                    | 0.0   |
    | Can you show me my asset allocation?   | Show me what stock makes up the highest concentration in my portfolio?   | 0.0   |
    | What is my portfolio's exposure to X?  | What top stocks do I have exposure to                                    | 0.0   |
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
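Taken together, the three datasets above are each paired with a different loss. A minimal training-setup sketch: each Dataset holds a single illustrative row taken from the samples above, and the parameters shown in the JSON blocks are the library defaults.

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import (
    ContrastiveLoss,
    CosineSimilarityLoss,
    MultipleNegativesRankingLoss,
)

model = SentenceTransformer("jinaai/jina-embedding-b-en-v1")

# One-row stand-ins for the three unnamed datasets described above
mnrl_ds = Dataset.from_dict({
    "sentence_0": ["How does my portfolio score look?"],
    "sentence_1": ["What is my portfolio score?"],
})  # (anchor, positive) pairs; the all-1.0 label column is not needed here
cosine_ds = Dataset.from_dict({
    "sentence_0": ["How well is my portfolio performing?"],
    "sentence_1": ["How is my portfolio performing"],
    "label": [1.0],
})
contrastive_ds = Dataset.from_dict({
    "sentence_0": ["How much of my portfolio is in X?"],
    "sentence_1": ["What top stocks do I have exposure to"],
    "label": [0.0],
})

# The parameters listed in the JSON blocks above are the library defaults
losses = {
    "mnrl": MultipleNegativesRankingLoss(model),        # scale=20.0, cos_sim
    "cosine": CosineSimilarityLoss(model),              # MSELoss regression on labels
    "contrastive": ContrastiveLoss(model, margin=0.5),  # cosine distance, margin 0.5
}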
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • multi_dataset_batch_sampler: round_robin
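Continuing from the loss/dataset sketch above, these non-default hyperparameters map onto the trainer roughly as follows (output_dir is a placeholder):

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.training_args import MultiDatasetBatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    num_train_epochs=15,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    # eval_strategy="steps" was also set; enabling it additionally requires
    # an eval_dataset or evaluator to be passed to the trainer
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

# Round-robin sampling draws one batch from each named dataset in turn,
# pairing it with the loss registered under the same key
trainer = SentenceTransformerTrainer(
    model=model,  # from the sketch above
    args=args,
    train_dataset={"mnrl": mnrl_ds, "cosine": cosine_ds, "contrastive": contrastive_ds},
    loss=losses,
)
trainer.train()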

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

| Epoch  | Step | Training Loss | test-eval_cosine_ndcg@10 |
|:-------|:-----|:--------------|:-------------------------|
| 1.0    | 126  | -             | 0.8744                   |
| 2.0    | 252  | -             | 0.8868                   |
| 3.0    | 378  | -             | 0.9037                   |
| 3.9683 | 500  | 0.1684        | 0.9159                   |
| 4.0    | 504  | -             | 0.9162                   |
| 5.0    | 630  | -             | 0.9252                   |
| 6.0    | 756  | -             | 0.9232                   |
| 7.0    | 882  | -             | 0.9310                   |
| 7.9365 | 1000 | 0.1143        | 0.9358                   |

Framework Versions

  • Python: 3.10.16
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}