SentenceTransformer

This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
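
Cosine similarity here is the normalized dot product between embeddings; a minimal NumPy sketch (the helper name is ours, not part of the library):

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Normalized dot product over the 768-dimensional embeddings;
    # 1.0 means identical direction, 0.0 orthogonal, -1.0 opposite.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))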

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
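
The two-module, CLS-pooling pipeline above can also be assembled by hand; a hedged sketch using the library's models API (the repository id is taken from the Usage section below):

from sentence_transformers import SentenceTransformer, models

# Rebuild the same pipeline: a BERT encoder followed by CLS-token pooling,
# matching pooling_mode_cls_token=True in the architecture above.
transformer = models.Transformer(
    "Detomo/cl-nagoya-sup-simcse-ja-nss-v1_0_8_2", max_seq_length=512
)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 768
    pooling_mode="cls",
)
model = SentenceTransformer(modules=[transformer, pooling])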

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v1_0_8_2")
# Run inference
sentences = [
    '科目:タイル。名称:汚垂タイル。',               # Category: tile. Name: urinal splash-area tile.
    '科目:タイル。名称:手洗い水周りタイル(A)。',    # Category: tile. Name: tile around the handwashing area (A).
    '科目:コンクリート。名称:普通コンクリート。',   # Category: concrete. Name: normal concrete.
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
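
Because model.similarity already returns cosine scores, nearest neighbors can be read straight off the matrix; a small sketch continuing from the variables above:

# Mask self-similarity on the diagonal, then report each sentence's nearest neighbor.
masked = similarities.clone().fill_diagonal_(float("-inf"))
for i, j in enumerate(masked.argmax(dim=1).tolist()):
    print(f"{sentences[i]} -> {sentences[j]} ({similarities[i, j].item():.3f})")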

Training Details

Training Dataset

Unnamed Dataset

  • Size: 354,867 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:

    |         | sentence1                                          | sentence2                                         | label                             |
    |:--------|:---------------------------------------------------|:--------------------------------------------------|:----------------------------------|
    | type    | string                                             | string                                            | int                               |
    | details | min: 11 tokens, mean: 13.78 tokens, max: 19 tokens | min: 11 tokens, mean: 14.8 tokens, max: 23 tokens | 0: ~74.00%, 1: ~2.60%, 2: ~23.40% |
  • Samples:

    | sentence1                                          | sentence2                                                           | label |
    |:---------------------------------------------------|:--------------------------------------------------------------------|:------|
    | 科目:コンクリート。名称:免震基礎天端グラウト注入。 | 科目:コンクリート。名称:免震BPL下部充填コンクリート打設手間。       | 0     |
    | 科目:コンクリート。名称:免震基礎天端グラウト注入。 | 科目:コンクリート。名称:免震下部コンクリート打設手間。             | 0     |
    | 科目:コンクリート。名称:免震基礎天端グラウト注入。 | 科目:コンクリート。名称:免震下部(外周基礎梁)コンクリート打設手間。 | 0     |
  • Loss: sentence_transformer_lib.categorical_constrastive_loss.CategoricalContrastiveLoss (a custom loss; see the sketch below)
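
The CategoricalContrastiveLoss implementation itself is not public. As a rough illustration only, here is a hypothetical contrastive loss over (sentence1, sentence2, label) triples, assuming label 0 marks unrelated pairs and labels 1 and 2 mark related pairs of increasing strength:

import torch
import torch.nn.functional as F

def categorical_contrastive_loss(emb1, emb2, labels, margin=0.5,
                                 targets=(0.0, 0.8, 1.0)):
    # Hypothetical stand-in, not the card's actual loss. labels is a
    # LongTensor with values in {0, 1, 2}. Related pairs (label > 0) are
    # pulled toward a per-category target cosine similarity; unrelated
    # pairs (label 0) are pushed below the margin.
    cos = F.cosine_similarity(emb1, emb2)                  # shape: (batch,)
    target = torch.tensor(targets, device=cos.device)[labels]
    related = labels > 0
    loss_pos = (target[related] - cos[related]).pow(2)
    loss_neg = F.relu(cos[~related] - margin).pow(2)
    return torch.cat([loss_pos, loss_neg]).mean()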

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 4
  • warmup_ratio: 0.2
  • fp16: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0360 50 0.0463
0.0721 100 0.0367
0.1081 150 0.0391
0.1442 200 0.0382
0.1802 250 0.0396
0.2163 300 0.0392
0.2523 350 0.0335
0.2884 400 0.0337
0.3244 450 0.0346
0.3605 500 0.0268
0.3965 550 0.0271
0.4326 600 0.0267
0.4686 650 0.029
0.5047 700 0.0269
0.5407 750 0.0221
0.5768 800 0.0252
0.6128 850 0.0229
0.6489 900 0.0235
0.6849 950 0.02
0.7210 1000 0.0198
0.7570 1050 0.0218
0.7931 1100 0.0219
0.8291 1150 0.0164
0.8652 1200 0.0165
0.9012 1250 0.0162
0.9373 1300 0.016
0.9733 1350 0.015
1.0094 1400 0.0143
1.0454 1450 0.0145
1.0815 1500 0.0136
1.1175 1550 0.0139
1.1536 1600 0.0122
1.1896 1650 0.0113
1.2257 1700 0.0125
1.2617 1750 0.0112
1.2978 1800 0.0111
1.3338 1850 0.0099
1.3699 1900 0.0103
1.4059 1950 0.0089
1.4420 2000 0.0087
1.4780 2050 0.0084
1.5141 2100 0.0082
1.5501 2150 0.0096
1.5862 2200 0.0082
1.6222 2250 0.0086
1.6583 2300 0.0083
1.6943 2350 0.0087
1.7304 2400 0.0071
1.7664 2450 0.0073
1.8025 2500 0.0092
1.8385 2550 0.0087
1.8745 2600 0.0077
1.9106 2650 0.0078
1.9466 2700 0.0059
1.9827 2750 0.0065
2.0187 2800 0.0067
2.0548 2850 0.0047
2.0908 2900 0.0055
2.1269 2950 0.0056
2.1629 3000 0.0051
2.1990 3050 0.0047
2.2350 3100 0.0054
2.2711 3150 0.0052
2.3071 3200 0.0051
2.3432 3250 0.0049
2.3792 3300 0.0046
2.4153 3350 0.0056
2.4513 3400 0.005
2.4874 3450 0.0045
2.5234 3500 0.0052
2.5595 3550 0.0056
2.5955 3600 0.005
2.6316 3650 0.005
2.6676 3700 0.0045
2.7037 3750 0.004
2.7397 3800 0.0055
2.7758 3850 0.0046
2.8118 3900 0.0039
2.8479 3950 0.0045
2.8839 4000 0.0048
2.9200 4050 0.0045
2.9560 4100 0.0053
2.9921 4150 0.0036
3.0281 4200 0.0042
3.0642 4250 0.0041
3.1002 4300 0.0034
3.1363 4350 0.0038
3.1723 4400 0.0029
3.2084 4450 0.0042
3.2444 4500 0.0035
3.2805 4550 0.0033
3.3165 4600 0.0031
3.3526 4650 0.0037
3.3886 4700 0.0032
3.4247 4750 0.0038
3.4607 4800 0.004
3.4968 4850 0.0042
3.5328 4900 0.003
3.5689 4950 0.004
3.6049 5000 0.0035
3.6410 5050 0.0028
3.6770 5100 0.003
3.7130 5150 0.0032
3.7491 5200 0.0029
3.7851 5250 0.0033
3.8212 5300 0.0036
3.8572 5350 0.0034
3.8933 5400 0.0038
3.9293 5450 0.003
3.9654 5500 0.0034

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.7.0
  • Datasets: 2.14.4
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}