SentenceTransformer

This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: ~111M parameters (F32 safetensors)
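
Cosine similarity scores embeddings by the angle between them, so vector magnitude is ignored. A minimal NumPy sketch of the computation (the 768-dimensional vectors here are random stand-ins for real sentence embeddings):

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two embedding vectors, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Random stand-ins with the model's output dimensionality.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=768)
emb_b = rng.normal(size=768)
print(cosine_similarity(emb_a, emb_b))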

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
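
Because the Pooling module sets pooling_mode_cls_token to True (and all mean/max modes to False), the sentence embedding is the final hidden state of the [CLS] token. A minimal sketch of the equivalent computation through the plain transformers API, assuming the repository loads as a standard BertModel:

import torch
from transformers import AutoModel, AutoTokenizer

name = "Detomo/cl-nagoya-sup-simcse-ja-nss-v1_0_8_5"
tokenizer = AutoTokenizer.from_pretrained(name)
encoder = AutoModel.from_pretrained(name)  # BertModel under the hood

batch = tokenizer(["科目:タイル。名称:段鼻タイル。"], return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, 768)
embedding = hidden[:, 0]  # [CLS] token -> (batch, 768), what the Pooling layer returns
print(embedding.shape)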

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v1_0_8_5")
# Run inference
sentences = [
    '科目:タイル。名称:ドライエリア床タイル。',    # Category: tile. Item: dry-area floor tile.
    '科目:タイル。名称:屋外階段踊場タイル張り。',  # Category: tile. Item: tiling for an outdoor stair landing.
    '科目:タイル。名称:段鼻タイル。',              # Category: tile. Item: stair-nosing tile.
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
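
Building on the snippet above, the same model supports a minimal semantic-search pattern: encode a query, score it against the corpus embeddings, and take the best match (the query string here is illustrative):

# Illustrative query in the same "category / item name" format.
query = '科目:タイル。名称:屋外階段タイル。'

query_embedding = model.encode([query])
scores = model.similarity(query_embedding, embeddings)  # shape [1, 3]
best = scores.argmax().item()
print(sentences[best], scores[0, best].item())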

Training Details

Training Dataset

Unnamed Dataset

  • Size: 356,381 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min: 11 tokens, mean: 13.78 tokens, max: 19 tokens
    • sentence2 (string): min: 11 tokens, mean: 14.8 tokens, max: 23 tokens
    • label (int): 0: ~74.10%, 1: ~2.60%, 2: ~23.30%
  • Samples (sentence1 / sentence2 / label); each text is a Japanese construction line item of the form 科目 (category) + 名称 (item name):
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 / 科目:コンクリート。名称:免震BPL下部充填コンクリート打設手間。 / 0
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 / 科目:コンクリート。名称:免震下部コンクリート打設手間。 / 0
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 / 科目:コンクリート。名称:免震下部(外周基礎梁)コンクリート打設手間。 / 0
  • Loss: sentence_transformer_lib.categorical_constrastive_loss.CategoricalContrastiveLoss
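
The loss is a custom class whose implementation is not included in this card. Purely as an illustration of what a categorical contrastive objective over three relatedness labels could look like, here is a sketch that maps each label to a target cosine similarity and penalizes the squared gap; the target values and the squared-error form are assumptions, not the card's actual CategoricalContrastiveLoss:

import torch
import torch.nn.functional as F

# Hypothetical label -> target-similarity mapping (0 = unrelated, 2 = equivalent).
TARGETS = {0: 0.0, 1: 0.5, 2: 1.0}

def categorical_contrastive_loss(emb1: torch.Tensor, emb2: torch.Tensor,
                                 labels: torch.Tensor) -> torch.Tensor:
    # emb1, emb2: (batch, dim) sentence embeddings; labels: (batch,) ints in {0, 1, 2}.
    # Illustrative stand-in only, not the card's CategoricalContrastiveLoss.
    cos = F.cosine_similarity(emb1, emb2, dim=-1)
    target = torch.tensor([TARGETS[int(l)] for l in labels],
                          dtype=cos.dtype, device=cos.device)
    return ((cos - target) ** 2).mean()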

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 4
  • warmup_ratio: 0.2
  • fp16: True
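
These non-default values slot directly into the Sentence Transformers training API; a minimal sketch of recreating the configuration (output_dir is a placeholder, and the custom loss would still have to be supplied to the trainer):

from sentence_transformers.training_args import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=4,
    warmup_ratio=0.2,  # 20% of training steps spent on linear warmup
    fp16=True,         # mixed-precision training
)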

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0072 10 0.2157
0.0144 20 0.1965
0.0215 30 0.164
0.0287 40 0.1199
0.0359 50 0.0913
0.0431 60 0.0687
0.0503 70 0.0462
0.0574 80 0.0459
0.0646 90 0.0424
0.0718 100 0.0416
0.0790 110 0.0377
0.0861 120 0.0472
0.0933 130 0.0437
0.1005 140 0.0332
0.1077 150 0.0411
0.1149 160 0.0361
0.1220 170 0.037
0.1292 180 0.0325
0.1364 190 0.0386
0.1436 200 0.0398
0.1508 210 0.0415
0.1579 220 0.0327
0.1651 230 0.0425
0.1723 240 0.0437
0.1795 250 0.0365
0.1866 260 0.028
0.1938 270 0.0412
0.2010 280 0.0424
0.2082 290 0.0382
0.2154 300 0.0282
0.2225 310 0.0358
0.2297 320 0.0311
0.2369 330 0.0339
0.2441 340 0.0313
0.2513 350 0.0333
0.2584 360 0.0238
0.2656 370 0.0367
0.2728 380 0.0295
0.2800 390 0.0286
0.2872 400 0.0358
0.2943 410 0.0288
0.3015 420 0.032
0.3087 430 0.0323
0.3159 440 0.0284
0.3230 450 0.0297
0.3302 460 0.0266
0.3374 470 0.0317
0.3446 480 0.0298
0.3518 490 0.0272
0.3589 500 0.0307
0.3661 510 0.0337
0.3733 520 0.0268
0.3805 530 0.0286
0.3877 540 0.0283
0.3948 550 0.0293
0.4020 560 0.0299
0.4092 570 0.0231
0.4164 580 0.0308
0.4235 590 0.0294
0.4307 600 0.0309
0.4379 610 0.0255
0.4451 620 0.0269
0.4523 630 0.0226
0.4594 640 0.028
0.4666 650 0.027
0.4738 660 0.0365
0.4810 670 0.0264
0.4882 680 0.0212
0.4953 690 0.0311
0.5025 700 0.0266
0.5097 710 0.0203
0.5169 720 0.0207
0.5240 730 0.0348
0.5312 740 0.0227
0.5384 750 0.0237
0.5456 760 0.0201
0.5528 770 0.0257
0.5599 780 0.0266
0.5671 790 0.0276
0.5743 800 0.0271
0.5815 810 0.0238
0.5887 820 0.0217
0.5958 830 0.018
0.6030 840 0.0223
0.6102 850 0.0208
0.6174 860 0.0248
0.6246 870 0.0264
0.6317 880 0.0198
0.6389 890 0.0215
0.6461 900 0.0193
0.6533 910 0.0191
0.6604 920 0.0205
0.6676 930 0.0219
0.6748 940 0.0229
0.6820 950 0.0234
0.6892 960 0.0225
0.6963 970 0.0185
0.7035 980 0.0174
0.7107 990 0.0169
0.7179 1000 0.0218
0.7251 1010 0.0141
0.7322 1020 0.0221
0.7394 1030 0.0185
0.7466 1040 0.0219
0.7538 1050 0.0183
0.7609 1060 0.0153
0.7681 1070 0.0168
0.7753 1080 0.0177
0.7825 1090 0.0177
0.7897 1100 0.0179
0.7968 1110 0.0181
0.8040 1120 0.02
0.8112 1130 0.0186
0.8184 1140 0.0185
0.8256 1150 0.0162
0.8327 1160 0.0156
0.8399 1170 0.0141
0.8471 1180 0.0152
0.8543 1190 0.0146
0.8615 1200 0.018
0.8686 1210 0.0194
0.8758 1220 0.0148
0.8830 1230 0.0183
0.8902 1240 0.0124
0.8973 1250 0.0141
0.9045 1260 0.0193
0.9117 1270 0.0169
0.9189 1280 0.0165
0.9261 1290 0.0101
0.9332 1300 0.0195
0.9404 1310 0.0168
0.9476 1320 0.0207
0.9548 1330 0.018
0.9620 1340 0.0116
0.9691 1350 0.0175
0.9763 1360 0.0138
0.9835 1370 0.0209
0.9907 1380 0.0145
0.9978 1390 0.0138
1.0050 1400 0.0123
1.0122 1410 0.0145
1.0194 1420 0.0135
1.0266 1430 0.0115
1.0337 1440 0.014
1.0409 1450 0.0106
1.0481 1460 0.0102
1.0553 1470 0.0133
1.0625 1480 0.008
1.0696 1490 0.0134
1.0768 1500 0.0106
1.0840 1510 0.0151
1.0912 1520 0.0168
1.0983 1530 0.0093
1.1055 1540 0.0132
1.1127 1550 0.0115
1.1199 1560 0.0096
1.1271 1570 0.012
1.1342 1580 0.0119
1.1414 1590 0.0108
1.1486 1600 0.013
1.1558 1610 0.0109
1.1630 1620 0.0131
1.1701 1630 0.0093
1.1773 1640 0.0126
1.1845 1650 0.009
1.1917 1660 0.0106
1.1989 1670 0.0102
1.2060 1680 0.0089
1.2132 1690 0.0096
1.2204 1700 0.0084
1.2276 1710 0.0099
1.2347 1720 0.0074
1.2419 1730 0.0131
1.2491 1740 0.0125
1.2563 1750 0.0102
1.2635 1760 0.0117
1.2706 1770 0.0099
1.2778 1780 0.0078
1.2850 1790 0.0095
1.2922 1800 0.0079
1.2994 1810 0.0069
1.3065 1820 0.0121
1.3137 1830 0.0101
1.3209 1840 0.0151
1.3281 1850 0.0107
1.3352 1860 0.0125
1.3424 1870 0.0111
1.3496 1880 0.0091
1.3568 1890 0.0082
1.3640 1900 0.0092
1.3711 1910 0.0107
1.3783 1920 0.0066
1.3855 1930 0.0141
1.3927 1940 0.0126
1.3999 1950 0.009
1.4070 1960 0.0116
1.4142 1970 0.0121
1.4214 1980 0.0098
1.4286 1990 0.0108
1.4358 2000 0.0103
1.4429 2010 0.0118
1.4501 2020 0.0143
1.4573 2030 0.0082
1.4645 2040 0.0077
1.4716 2050 0.0102
1.4788 2060 0.0093
1.4860 2070 0.0084
1.4932 2080 0.0105
1.5004 2090 0.0091
1.5075 2100 0.0094
1.5147 2110 0.0092
1.5219 2120 0.0117
1.5291 2130 0.0085
1.5363 2140 0.0069
1.5434 2150 0.0114
1.5506 2160 0.0077
1.5578 2170 0.0092
1.5650 2180 0.0093
1.5721 2190 0.0076
1.5793 2200 0.0098
1.5865 2210 0.01
1.5937 2220 0.01
1.6009 2230 0.0092
1.6080 2240 0.0096
1.6152 2250 0.0077
1.6224 2260 0.0147
1.6296 2270 0.0087
1.6368 2280 0.0106
1.6439 2290 0.007
1.6511 2300 0.0091
1.6583 2310 0.0083
1.6655 2320 0.0113
1.6726 2330 0.0076
1.6798 2340 0.0096
1.6870 2350 0.0087
1.6942 2360 0.0068
1.7014 2370 0.0064
1.7085 2380 0.0088
1.7157 2390 0.0052
1.7229 2400 0.0088
1.7301 2410 0.0068
1.7373 2420 0.0072
1.7444 2430 0.0076
1.7516 2440 0.0078
1.7588 2450 0.0066
1.7660 2460 0.0086
1.7732 2470 0.0051
1.7803 2480 0.0115
1.7875 2490 0.0059
1.7947 2500 0.0088
1.8019 2510 0.0078
1.8090 2520 0.0057
1.8162 2530 0.0076
1.8234 2540 0.0077
1.8306 2550 0.009
1.8378 2560 0.0073
1.8449 2570 0.009
1.8521 2580 0.0094
1.8593 2590 0.0068
1.8665 2600 0.0081
1.8737 2610 0.004
1.8808 2620 0.0077
1.8880 2630 0.0072
1.8952 2640 0.0061
1.9024 2650 0.0077
1.9095 2660 0.0074
1.9167 2670 0.0077
1.9239 2680 0.0073
1.9311 2690 0.0096
1.9383 2700 0.006
1.9454 2710 0.0092
1.9526 2720 0.005
1.9598 2730 0.0045
1.9670 2740 0.0071
1.9742 2750 0.0061
1.9813 2760 0.0073
1.9885 2770 0.0073
1.9957 2780 0.0067
2.0029 2790 0.0054
2.0101 2800 0.0044
2.0172 2810 0.0045
2.0244 2820 0.005
2.0316 2830 0.0066
2.0388 2840 0.0038
2.0459 2850 0.0051
2.0531 2860 0.0039
2.0603 2870 0.0051
2.0675 2880 0.0056
2.0747 2890 0.0054
2.0818 2900 0.0069
2.0890 2910 0.006
2.0962 2920 0.0074
2.1034 2930 0.0067
2.1106 2940 0.0044
2.1177 2950 0.0065
2.1249 2960 0.0066
2.1321 2970 0.0044
2.1393 2980 0.0041
2.1464 2990 0.0066
2.1536 3000 0.0046
2.1608 3010 0.0061
2.1680 3020 0.0039
2.1752 3030 0.0048
2.1823 3040 0.0059
2.1895 3050 0.0067
2.1967 3060 0.005
2.2039 3070 0.0028
2.2111 3080 0.0055
2.2182 3090 0.0032
2.2254 3100 0.0074
2.2326 3110 0.0052
2.2398 3120 0.0058
2.2469 3130 0.0067
2.2541 3140 0.0065
2.2613 3150 0.0036
2.2685 3160 0.005
2.2757 3170 0.0083
2.2828 3180 0.0038
2.2900 3190 0.0044
2.2972 3200 0.0057
2.3044 3210 0.0042
2.3116 3220 0.0037
2.3187 3230 0.0061
2.3259 3240 0.0038
2.3331 3250 0.0051
2.3403 3260 0.0076
2.3475 3270 0.005
2.3546 3280 0.0042
2.3618 3290 0.005
2.3690 3300 0.0077
2.3762 3310 0.0067
2.3833 3320 0.008
2.3905 3330 0.0077
2.3977 3340 0.0052
2.4049 3350 0.0055
2.4121 3360 0.0059
2.4192 3370 0.0042
2.4264 3380 0.0044
2.4336 3390 0.0055
2.4408 3400 0.0048
2.4480 3410 0.0035
2.4551 3420 0.0068
2.4623 3430 0.007
2.4695 3440 0.0059
2.4767 3450 0.0037
2.4838 3460 0.0049
2.4910 3470 0.0042
2.4982 3480 0.004
2.5054 3490 0.0033
2.5126 3500 0.004
2.5197 3510 0.0055
2.5269 3520 0.0057
2.5341 3530 0.0059
2.5413 3540 0.0031
2.5485 3550 0.0039
2.5556 3560 0.0046
2.5628 3570 0.0035
2.5700 3580 0.0037
2.5772 3590 0.0045
2.5844 3600 0.006
2.5915 3610 0.0058
2.5987 3620 0.0053
2.6059 3630 0.0045
2.6131 3640 0.0031
2.6202 3650 0.0063
2.6274 3660 0.004
2.6346 3670 0.0043
2.6418 3680 0.0055
2.6490 3690 0.0044
2.6561 3700 0.0025
2.6633 3710 0.0047
2.6705 3720 0.0043
2.6777 3730 0.0041
2.6849 3740 0.0064
2.6920 3750 0.0055
2.6992 3760 0.0038
2.7064 3770 0.0059
2.7136 3780 0.0059
2.7207 3790 0.0039
2.7279 3800 0.0051
2.7351 3810 0.0061
2.7423 3820 0.0029
2.7495 3830 0.0043
2.7566 3840 0.0044
2.7638 3850 0.0047
2.7710 3860 0.0041
2.7782 3870 0.0033
2.7854 3880 0.0028
2.7925 3890 0.0049
2.7997 3900 0.0048
2.8069 3910 0.0042
2.8141 3920 0.0047
2.8212 3930 0.0043
2.8284 3940 0.0034
2.8356 3950 0.0034
2.8428 3960 0.0036
2.8500 3970 0.0057
2.8571 3980 0.0067
2.8643 3990 0.0053
2.8715 4000 0.0045
2.8787 4010 0.0044
2.8859 4020 0.0045
2.8930 4030 0.0028
2.9002 4040 0.0032
2.9074 4050 0.0054
2.9146 4060 0.005
2.9218 4070 0.0039
2.9289 4080 0.003
2.9361 4090 0.0036
2.9433 4100 0.003
2.9505 4110 0.0052
2.9576 4120 0.0029
2.9648 4130 0.0038
2.9720 4140 0.0048
2.9792 4150 0.0046
2.9864 4160 0.005
2.9935 4170 0.0047
3.0007 4180 0.0048
3.0079 4190 0.0033
3.0151 4200 0.0026
3.0223 4210 0.0031
3.0294 4220 0.0043
3.0366 4230 0.0034
3.0438 4240 0.0038
3.0510 4250 0.0023
3.0581 4260 0.0036
3.0653 4270 0.0045
3.0725 4280 0.0028
3.0797 4290 0.0025
3.0869 4300 0.0036
3.0940 4310 0.0055
3.1012 4320 0.0041
3.1084 4330 0.0027
3.1156 4340 0.0048
3.1228 4350 0.0049
3.1299 4360 0.0028
3.1371 4370 0.0052
3.1443 4380 0.0029
3.1515 4390 0.0039
3.1587 4400 0.0029
3.1658 4410 0.0045
3.1730 4420 0.0031
3.1802 4430 0.004
3.1874 4440 0.0042
3.1945 4450 0.0039
3.2017 4460 0.0027
3.2089 4470 0.0031
3.2161 4480 0.0043
3.2233 4490 0.0027
3.2304 4500 0.0035
3.2376 4510 0.0034
3.2448 4520 0.0039
3.2520 4530 0.0026
3.2592 4540 0.0035
3.2663 4550 0.0041
3.2735 4560 0.0021
3.2807 4570 0.0032
3.2879 4580 0.0032
3.2950 4590 0.0026
3.3022 4600 0.0045
3.3094 4610 0.0046
3.3166 4620 0.0014
3.3238 4630 0.0026
3.3309 4640 0.0026
3.3381 4650 0.002
3.3453 4660 0.0043
3.3525 4670 0.0051
3.3597 4680 0.0041
3.3668 4690 0.0021
3.3740 4700 0.0059
3.3812 4710 0.006
3.3884 4720 0.0049
3.3955 4730 0.0035
3.4027 4740 0.004
3.4099 4750 0.0039
3.4171 4760 0.0024
3.4243 4770 0.0026
3.4314 4780 0.0038
3.4386 4790 0.0029
3.4458 4800 0.0045
3.4530 4810 0.0025
3.4602 4820 0.0031
3.4673 4830 0.0044
3.4745 4840 0.0018
3.4817 4850 0.0035
3.4889 4860 0.0031
3.4961 4870 0.0058
3.5032 4880 0.0032
3.5104 4890 0.0028
3.5176 4900 0.0029
3.5248 4910 0.0038
3.5319 4920 0.0026
3.5391 4930 0.0028
3.5463 4940 0.0034
3.5535 4950 0.0044
3.5607 4960 0.003
3.5678 4970 0.0028
3.5750 4980 0.0031
3.5822 4990 0.003
3.5894 5000 0.0028

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.8.1
  • Datasets: 2.14.4
  • Tokenizers: 0.21.1
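
To reproduce this environment, the Python packages above can be pinned at install time (a minimal sketch; the CUDA 12.4 build of PyTorch 2.6.0 may additionally require the matching PyTorch index URL):

pip install sentence-transformers==4.1.0 transformers==4.52.4 accelerate==1.8.1 datasets==2.14.4 tokenizers==0.21.1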

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}