SentenceTransformer

This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
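
After loading, the individual modules can be inspected to confirm this configuration. A minimal sketch (the expected outputs follow from the architecture printout above):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_8")
transformer, pooling = model[0], model[1]
print(transformer.max_seq_length)                # 512
print(pooling.get_pooling_mode_str())            # "cls" (CLS-token pooling)
print(model.get_sentence_embedding_dimension())  # 768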

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_8")
# Run inference
sentences = [
    '科目:ユニット及びその他。名称:#Fスタッフステーションカウンター。',
    '科目:ユニット及びその他。名称:誘導サイン(自立)。',
    '科目:ユニット及びその他。名称:デジタルサイネージ。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
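
Since the model embeds building-specification item descriptions, a common follow-up is to match a new description against a list of candidate standard names. A minimal sketch, where the candidate list is purely illustrative (reused from the examples above), not a definitive catalogue:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_8")

# Illustrative candidates; replace with your own list of standard names
candidates = [
    '科目:ユニット及びその他。名称:誘導サイン(自立)。',
    '科目:ユニット及びその他。名称:デジタルサイネージ。',
    '科目:コンクリート。名称:コンクリートポンプ圧送。',
]
query = '科目:コンクリート。名称:ポンプ圧送。'

candidate_embeddings = model.encode(candidates)
query_embedding = model.encode([query])

# Cosine similarities between the query and every candidate: shape [1, len(candidates)]
scores = model.similarity(query_embedding, candidate_embeddings)
best = scores.argmax().item()
print(candidates[best], scores[0, best].item())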

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,301 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:
    sentence (type: string):
    • min: 11 tokens
    • mean: 17.76 tokens
    • max: 32 tokens
    label (type: int), approximate class distribution:
    • 0: ~0.10%
    • 1: ~0.20%
    • 2: ~0.10%
    • 3: ~0.10%
    • 4: ~0.20%
    • 5: ~0.10%
    • 6: ~0.10%
    • 7: ~0.10%
    • 8: ~0.20%
    • 9: ~0.10%
    • 10: ~0.10%
    • 11: ~0.40%
    • 12: ~0.10%
    • 13: ~0.10%
    • 14: ~0.10%
    • 15: ~0.10%
    • 16: ~0.10%
    • 17: ~0.10%
    • 18: ~0.50%
    • 19: ~0.20%
    • 20: ~0.20%
    • 21: ~0.10%
    • 22: ~0.10%
    • 23: ~0.10%
    • 24: ~0.30%
    • 25: ~0.10%
    • 26: ~0.20%
    • 27: ~0.20%
    • 28: ~0.20%
    • 29: ~0.20%
    • 30: ~0.10%
    • 31: ~0.10%
    • 32: ~0.20%
    • 33: ~0.20%
    • 34: ~0.10%
    • 35: ~0.20%
    • 36: ~0.20%
    • 37: ~0.20%
    • 38: ~0.20%
    • 39: ~0.20%
    • 40: ~0.40%
    • 41: ~0.20%
    • 42: ~0.20%
    • 43: ~0.20%
    • 44: ~0.60%
    • 45: ~0.70%
    • 46: ~0.20%
    • 47: ~0.20%
    • 48: ~0.10%
    • 49: ~0.20%
    • 50: ~0.10%
    • 51: ~0.20%
    • 52: ~0.10%
    • 53: ~0.10%
    • 54: ~0.20%
    • 55: ~0.20%
    • 56: ~0.30%
    • 57: ~0.80%
    • 58: ~0.30%
    • 59: ~0.10%
    • 60: ~0.70%
    • 61: ~0.30%
    • 62: ~0.20%
    • 63: ~0.20%
    • 64: ~0.50%
    • 65: ~0.10%
    • 66: ~0.20%
    • 67: ~0.20%
    • 68: ~0.20%
    • 69: ~0.30%
    • 70: ~0.30%
    • 71: ~0.20%
    • 72: ~0.20%
    • 73: ~0.20%
    • 74: ~0.20%
    • 75: ~0.20%
    • 76: ~0.10%
    • 77: ~0.20%
    • 78: ~0.30%
    • 79: ~0.20%
    • 80: ~0.20%
    • 81: ~0.10%
    • 82: ~0.20%
    • 83: ~0.50%
    • 84: ~0.30%
    • 85: ~0.60%
    • 86: ~0.20%
    • 87: ~0.30%
    • 88: ~0.20%
    • 89: ~0.20%
    • 90: ~0.20%
    • 91: ~0.20%
    • 92: ~1.00%
    • 93: ~1.70%
    • 94: ~3.70%
    • 95: ~0.50%
    • 96: ~0.20%
    • 97: ~0.20%
    • 98: ~0.80%
    • 99: ~0.20%
    • 100: ~0.20%
    • 101: ~0.20%
    • 102: ~0.20%
    • 103: ~0.30%
    • 104: ~1.20%
    • 105: ~0.20%
    • 106: ~0.20%
    • 107: ~0.40%
    • 108: ~0.30%
    • 109: ~0.20%
    • 110: ~0.20%
    • 111: ~0.20%
    • 112: ~0.30%
    • 113: ~0.20%
    • 114: ~0.20%
    • 115: ~0.10%
    • 116: ~0.30%
    • 117: ~0.40%
    • 118: ~0.20%
    • 119: ~0.20%
    • 120: ~0.20%
    • 121: ~0.20%
    • 122: ~0.30%
    • 123: ~0.20%
    • 124: ~0.20%
    • 125: ~0.20%
    • 126: ~0.10%
    • 127: ~0.20%
    • 128: ~0.10%
    • 129: ~0.30%
    • 130: ~0.20%
    • 131: ~0.20%
    • 132: ~0.10%
    • 133: ~0.50%
    • 134: ~0.20%
    • 135: ~0.20%
    • 136: ~0.20%
    • 137: ~0.20%
    • 138: ~0.20%
    • 139: ~0.10%
    • 140: ~0.10%
    • 141: ~0.40%
    • 142: ~0.70%
    • 143: ~0.20%
    • 144: ~3.10%
    • 145: ~0.20%
    • 146: ~2.30%
    • 147: ~0.30%
    • 148: ~0.30%
    • 149: ~0.50%
    • 150: ~0.50%
    • 151: ~0.50%
    • 152: ~0.20%
    • 153: ~0.20%
    • 154: ~0.20%
    • 155: ~0.20%
    • 156: ~0.30%
    • 157: ~0.30%
    • 158: ~0.30%
    • 159: ~0.20%
    • 160: ~0.30%
    • 161: ~0.20%
    • 162: ~0.20%
    • 163: ~0.10%
    • 164: ~0.20%
    • 165: ~0.20%
    • 166: ~0.30%
    • 167: ~0.20%
    • 168: ~0.20%
    • 169: ~0.20%
    • 170: ~0.20%
    • 171: ~0.20%
    • 172: ~0.20%
    • 173: ~0.20%
    • 174: ~0.30%
    • 175: ~0.30%
    • 176: ~0.20%
    • 177: ~0.20%
    • 178: ~0.20%
    • 179: ~0.20%
    • 180: ~0.30%
    • 181: ~0.60%
    • 182: ~0.20%
    • 183: ~0.20%
    • 184: ~0.20%
    • 185: ~0.20%
    • 186: ~0.20%
    • 187: ~0.70%
    • 188: ~0.20%
    • 189: ~0.20%
    • 190: ~0.30%
    • 191: ~0.20%
    • 192: ~1.30%
    • 193: ~0.20%
    • 194: ~0.30%
    • 195: ~0.30%
    • 196: ~0.20%
    • 197: ~0.30%
    • 198: ~0.10%
    • 199: ~1.10%
    • 200: ~0.20%
    • 201: ~0.20%
    • 202: ~0.20%
    • 203: ~0.10%
    • 204: ~0.10%
    • 205: ~0.20%
    • 206: ~0.20%
    • 207: ~0.10%
    • 208: ~1.10%
    • 209: ~0.40%
    • 210: ~0.10%
    • 211: ~0.20%
    • 212: ~0.20%
    • 213: ~0.10%
    • 214: ~1.00%
    • 215: ~0.20%
    • 216: ~0.30%
    • 217: ~0.10%
    • 218: ~1.80%
    • 219: ~0.30%
    • 220: ~0.50%
    • 221: ~0.20%
    • 222: ~0.20%
    • 223: ~0.10%
    • 224: ~0.20%
    • 225: ~0.10%
    • 226: ~0.20%
    • 227: ~0.20%
    • 228: ~0.10%
    • 229: ~0.30%
    • 230: ~4.00%
    • 231: ~0.20%
    • 232: ~0.20%
    • 233: ~0.10%
    • 234: ~0.60%
    • 235: ~0.20%
    • 236: ~0.30%
    • 237: ~0.70%
    • 238: ~0.20%
    • 239: ~0.30%
    • 240: ~0.30%
    • 241: ~0.40%
    • 242: ~0.30%
    • 243: ~0.10%
    • 244: ~0.20%
    • 245: ~0.30%
    • 246: ~0.20%
    • 247: ~0.10%
    • 248: ~0.10%
    • 249: ~0.30%
    • 250: ~0.30%
    • 251: ~0.30%
    • 252: ~0.60%
    • 253: ~0.20%
    • 254: ~0.20%
    • 255: ~0.20%
    • 256: ~0.30%
    • 257: ~0.20%
    • 258: ~2.20%
    • 259: ~0.30%
    • 260: ~0.20%
    • 261: ~0.20%
    • 262: ~0.30%
    • 263: ~0.10%
    • 264: ~0.10%
    • 265: ~0.50%
    • 266: ~0.10%
    • 267: ~0.10%
    • 268: ~0.10%
    • 269: ~0.10%
    • 270: ~0.20%
    • 271: ~0.90%
    • 272: ~0.20%
    • 273: ~0.20%
    • 274: ~0.10%
    • 275: ~0.40%
    • 276: ~0.20%
    • 277: ~0.20%
    • 278: ~0.10%
    • 279: ~0.10%
    • 280: ~0.20%
    • 281: ~0.10%
    • 282: ~0.20%
    • 283: ~2.90%
    • 284: ~0.20%
    • 285: ~0.20%
    • 286: ~0.30%
    • 287: ~0.20%
    • 288: ~0.20%
    • 289: ~0.80%
    • 290: ~0.20%
    • 291: ~0.20%
    • 292: ~3.90%
    • 293: ~0.30%
    • 294: ~0.10%
    • 295: ~0.20%
    • 296: ~0.70%
    • 297: ~0.40%
    • 298: ~0.20%
    • 299: ~0.20%
  • Samples (sentence → label):
    科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    科目:コンクリート。名称:コンクリートポンプ圧送。 → 1
    科目:コンクリート。名称:ポンプ圧送。 → 1
  • Loss: BatchAllTripletLoss
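
BatchAllTripletLoss expects a dataset with a text column and an integer label column, and it forms every valid (anchor, positive, negative) triplet inside each batch from those labels, so no triplets need to be mined offline. A minimal sketch of the pairing, using the sample rows above (not the actual training script):

from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_8")

# Tiny illustrative dataset with the same (sentence, label) schema as above
train_dataset = Dataset.from_dict({
    "sentence": [
        '科目:コンクリート。名称:免震基礎天端グラウト注入。',
        '科目:コンクリート。名称:コンクリートポンプ圧送。',
        '科目:コンクリート。名称:ポンプ圧送。',
    ],
    "label": [0, 1, 1],
})

# All valid triplets within a batch are built from the integer labels
loss = losses.BatchAllTripletLoss(model)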

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 200
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: group_by_label
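
The values above map directly onto SentenceTransformerTrainingArguments; group_by_label corresponds to BatchSamplers.GROUP_BY_LABEL, which keeps several samples of the same label in each batch so BatchAllTripletLoss always has positives available. A minimal sketch, continuing the dataset and loss sketch above (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # placeholder
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=200,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.GROUP_BY_LABEL,
)

trainer = SentenceTransformerTrainer(
    model=model,                  # from the loss sketch above
    args=args,
    train_dataset=train_dataset,  # from the loss sketch above
    loss=loss,
)
# trainer.train()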

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 200
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
3.6471 50 0.5866
7.5294 100 0.4693
11.4118 150 0.4486
15.2941 200 0.2783
19.1765 250 0.2732
23.0588 300 0.3268
26.7059 350 0.3403
30.5882 400 0.1967
34.4706 450 0.2025
38.3529 500 0.2108
42.2353 550 0.1458
46.1176 600 0.1914
49.7647 650 0.1065
53.6471 700 0.0607
57.5294 750 0.128
61.4118 800 0.0579
65.2941 850 0.1695
69.1765 900 0.1121
73.0588 950 0.1096
76.7059 1000 0.1213
80.5882 1050 0.0485
84.4706 1100 0.0759
88.3529 1150 0.0673
92.2353 1200 0.111
96.1176 1250 0.0159
99.7647 1300 0.1044
103.6471 1350 0.0928
107.5294 1400 0.0712
111.4118 1450 0.096
115.2941 1500 0.0648
119.1765 1550 0.0534
123.0588 1600 0.0071
126.7059 1650 0.0688
130.5882 1700 0.105
134.4706 1750 0.0344
138.3529 1800 0.0543
142.2353 1850 0.0072
146.1176 1900 0.0218
149.7647 1950 0.0203
153.6471 2000 0.0837
157.5294 2050 0.0423
161.4118 2100 0.0457
165.2941 2150 0.0591
169.1765 2200 0.0168
173.0588 2250 0.0234
176.7059 2300 0.0452
180.5882 2350 0.031
184.4706 2400 0.0241
188.3529 2450 0.0001
192.2353 2500 0.0427
196.1176 2550 0.0381
199.7647 2600 0.0203

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

BatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}