SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model fine-tuned from BAAI/bge-m3. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-m3
Maximum Sequence Length: 8192 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity

Model Sources

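Documentation: Sentence Transformers Documentation (https://sbert.net)
Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)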

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
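
The three modules above amount to: run the transformer, keep the hidden state of the first token (CLS pooling), then scale that vector to unit length. The following is a minimal pure-Python sketch of the last two steps on made-up numbers, not the library's implementation. The helper names `cls_pool`, `l2_normalize`, and `dot` are hypothetical, and the toy vectors have 4 dimensions instead of 1024.

```python
import math

def cls_pool(token_embeddings):
    """Pooling with pooling_mode_cls_token=True: keep only the first token's vector."""
    return token_embeddings[0]

def l2_normalize(vector):
    """Normalize(): scale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy transformer outputs: one list of per-token vectors per sentence.
# Real outputs have 1024 dimensions; 4 are used here for readability.
sentence_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
sentence_b = [[0.0, 4.0, 3.0, 0.0], [1.0, 2.0, 3.0, 4.0]]

emb_a = l2_normalize(cls_pool(sentence_a))
emb_b = l2_normalize(cls_pool(sentence_b))

# After normalization, cosine similarity reduces to the dot product.
print(emb_a)              # [0.6, 0.8, 0.0, 0.0]
print(dot(emb_a, emb_b))  # approximately 0.64
```

Because every embedding leaves the model with unit length, the cosine similarity listed under Model Description and a plain dot product give the same ranking.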

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
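# Run inference on a Korean query (preventing skids on icy roads) and two
# owner's-manual passages (snow/ice driving precautions, brake warning light)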
sentences = [
    '빙판길 주행 시 차량의 미끄러짐을 방지하기 위한 타이어 관련 조치는 무엇인가요?',
    '눈길, 빙판길 주행 시 주의 사항 • 가속 페달을 서서히 밟아 부드럽게 출발하십시오. • 타이어 체인이나 스노우 타이어를 사용하십시오. • 차간거리를 충분히 유지하고, 브레이크 사용 시는 엔진 브레이크를 사용하십시오. • 주행 중 급가속, 급제동, 과도한 스티어링 휠 조작을 하지 마십시오. 차량이 미끄러질 수 있어 위험합니다',
    '주행 중 브레이크 경고등이 켜지는 경우: 브레이크 계통에 이상이 있다는 신호이므로 아래와 같은 방법으로 안전한 장소에 차를 정지시키십시오. 1. 브레이크 페달을 밟아도 제동이 되지 않으면 평상시보다 페달을 강하게 밟아보십시오. 2. 엔진 브레이크를 걸어 속도를 늦추고 파킹 브레이크를 작동시키면서 브레이크 페달을 밟으십시오. 주행 중 경고등이 켜진 상태에서는 급제동하지 말고 서서히 정지하십시오.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
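
For semantic search, the same scores can be used to rank passages against a single query. The sketch below shows one way to do that with plain Python lists. The function name `rank_passages` and the toy 2-dimensional vectors are assumptions for illustration; in practice the vectors would come from `model.encode`.

```python
def rank_passages(query_vec, passage_vecs, top_k=2):
    """Return (passage_index, score) pairs sorted by descending dot product.

    Assumes all vectors are already L2-normalized, as this model's final
    Normalize() module produces, so the dot product equals cosine similarity.
    """
    scores = [
        (idx, sum(q * p for q, p in zip(query_vec, vec)))
        for idx, vec in enumerate(passage_vecs)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Toy stand-ins for model.encode(...) output.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]

ranking = rank_passages(query, passages)
print(ranking)  # [(2, 1.0), (1, 0.6)]
```

With the example above, the first encoded sentence would play the role of the query and the remaining two the passages.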

Evaluation

Metrics

Information Retrieval

Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.0314
cosine_accuracy@3	0.7397
cosine_accuracy@5	0.8645
cosine_accuracy@10	0.9438
cosine_precision@1	0.0314
cosine_precision@3	0.2466
cosine_precision@5	0.1729
cosine_precision@10	0.0944
cosine_recall@1	0.0314
cosine_recall@3	0.7397
cosine_recall@5	0.8645
cosine_recall@10	0.9438
cosine_ndcg@10	0.5574
cosine_mrr@10	0.4268
cosine_map@100	0.4295
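
To make the metric names concrete, here is a small sketch that computes accuracy, precision, recall, reciprocal rank, and NDCG at k for one query with binary relevance. It follows the usual textbook definitions and is not the evaluator's own code. The function name `ir_metrics_at_k` and the passage ids are hypothetical.

```python
import math

def ir_metrics_at_k(ranked_ids, relevant_ids, k):
    """Compute accuracy, precision, recall, reciprocal rank and NDCG at k for one query."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    # Rank (1-based) of the first relevant passage, or None if there is none in the top k.
    first_hit_rank = next((rank for rank, hit in enumerate(hits, start=1) if hit), None)
    # Discounted cumulative gain with binary relevance.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    # Ideal DCG: all relevant passages placed at the top of the ranking.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return {
        "accuracy": 1.0 if any(hits) else 0.0,
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
        "ndcg": dcg / idcg,
    }

# Toy query: the single relevant passage "p7" is ranked second.
metrics = ir_metrics_at_k(["p3", "p7", "p1"], {"p7"}, k=3)
print(metrics)
```

The reported values are averages of such per-query numbers. When each query has exactly one relevant passage, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k; the table above fits that pattern (for example, 0.7397 / 3 ≈ 0.2466), although the card does not state the number of relevant passages per query.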

Training Details

Training Dataset

Unnamed Dataset

Size: 1,634 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 1000 samples:
	sentence_0	sentence_1
type	string	string
details	min: 10 tokens mean: 23.97 tokens max: 47 tokens	min: 14 tokens mean: 209.79 tokens max: 2889 tokens

Samples:

sentence_0	sentence_1
`AUTO STOP 누적 시간을 초기화하려면 어떤 조작을 해야 하나요?`	`AUTO STOP 누적 시간: 공회전 제한 시스템에 의해 엔진이 꺼진 상태의 누적 시간을 표시합니다. 스티어링 휠의 OK 스위치를 길게 눌러 정보를 초기화할 수 있습니다.`
`주차 브레이크 레버를 당길 때 몇 kg의 힘으로 몇 번 "딸깍" 소리가 나야 정상적으로 작동하는 것입니까?`	주차 브레이크 점검 주차 브레이크 작동 상태 점검 정기 점검 주기에 따라 반드시 안전한 상태에서 주차 브레이크 점검을 실시하십시오. 핸드 주차 브레이크 • 전·후방에 차량이 없는 상태에서 주차 브레이크 레버를 당겨 가파른 언덕길에서 제동이 되는지 점검하십시오. • 평탄하고 안전한 장소에 주차시킨 후, 주차 브레이크가 완전히 해제된 상태에서 주차 브레이크 레버를 20kg의 힘으로 당겼을 때 5~7회 “딸깍” 거리는지 확인하십시오. 주의 주차 브레이크가 작동된 상태에서 주행하면 브레이크 패드의 과다한 마모의 원인이 됩니다.
`자동 변속기를 사용할 때 차량을 멈추기 위해 반드시 해야 하는 것은 무엇인가요?`	`자동 변속기 작동 변속 위치별 기능 자동 변속기는 변속 위치, 차량 속도, 가속 페달의 위치에 따라 1속~8속까지 자동으로 변속합니다 'D' 주행, 'R' 후진 및 수동 변속 모드 구간(+, -)에서는 브레이크 페달을 밟고 있지 않으면 가속 페달을 밟지 않아도 차량이 전진하거나 후진합니다 차량을 멈추려면 반드시 브레이크 페달을 밟으십시오`

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
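
As an illustration of what these two parameters do, the sketch below computes an in-batch-negatives loss in pure Python: cosine similarities between every anchor and every positive are multiplied by `scale`, and cross-entropy is taken with the matching positive as the target. This is a simplified reading of the loss on toy vectors, not the library's implementation; the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity, standing in for similarity_fct = cos_sim."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other row instead

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print(loss_aligned)  # close to 0
print(loss_swapped)  # close to 20
```

In this card's setup, sentence_0 plays the role of the anchor and sentence_1 the positive, so a batch of 5 pairs gives each question 4 in-batch negatives per device.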

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 5
per_device_eval_batch_size: 5
num_train_epochs: 30
fp16: True
multi_dataset_batch_sampler: round_robin
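
One possible way to express these settings with the Sentence Transformers trainer API is sketched below. This is a configuration fragment under assumptions, not the author's training script: the helper `build_training_args` and its `output_dir` argument are hypothetical, and the card does not say which training entry point was used.

```python
# Non-default values copied from the list above.
NON_DEFAULT_ARGS = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 5,
    "per_device_eval_batch_size": 5,
    "num_train_epochs": 30,
    "fp16": True,  # mixed precision; typically requires a GPU
    "multi_dataset_batch_sampler": "round_robin",
}

def build_training_args(output_dir):
    """Create training arguments; `output_dir` is a hypothetical path chosen by the caller."""
    # Imported lazily so the dictionary above can be inspected without the library installed.
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(output_dir=output_dir, **NON_DEFAULT_ARGS)
```

All other arguments keep their defaults, which are listed in full under All Hyperparameters.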

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 5
per_device_eval_batch_size: 5
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 30
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
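
The combination of learning_rate 5e-05, lr_scheduler_type linear, and no warmup means the learning rate falls in a straight line from its starting value to zero over the run. A minimal sketch of that schedule follows. The function name `linear_lr` is hypothetical, and the default of 3270 total steps is taken from the final step in the training logs.

```python
def linear_lr(step, base_lr=5e-05, total_steps=3270, warmup_steps=0):
    """Learning rate at a given optimizer step for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup phase (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 1635, 3270):
    print(step, linear_lr(step))
```

At the midpoint of training (step 1635) this gives half the initial rate, and the rate reaches zero at the final step.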

Training Logs

Epoch	Step	Training Loss	cosine_ndcg@10
0.4587	50	-	0.5795
0.9174	100	-	0.5772
1.0	109	-	0.5782
1.3761	150	-	0.5838
1.8349	200	-	0.5801
2.0	218	-	0.5820
2.2936	250	-	0.5793
2.7523	300	-	0.5841
3.0	327	-	0.5892
3.2110	350	-	0.5850
3.6697	400	-	0.5847
4.0	436	-	0.5856
4.1284	450	-	0.5828
4.5872	500	0.0665	0.5835
5.0	545	-	0.5891
5.0459	550	-	0.5921
5.5046	600	-	0.5813
5.9633	650	-	0.5854
6.0	654	-	0.5891
6.4220	700	-	0.5787
6.8807	750	-	0.5790
7.0	763	-	0.5835
7.3394	800	-	0.5749
7.7982	850	-	0.5779
8.0	872	-	0.5804
8.2569	900	-	0.5784
8.7156	950	-	0.5720
9.0	981	-	0.5762
9.1743	1000	0.0241	0.5730
9.6330	1050	-	0.5804
10.0	1090	-	0.5802
10.0917	1100	-	0.5803
10.5505	1150	-	0.5734
11.0	1199	-	0.5769
11.0092	1200	-	0.5799
11.4679	1250	-	0.5807
11.9266	1300	-	0.5738
12.0	1308	-	0.5770
12.3853	1350	-	0.5744
12.8440	1400	-	0.5751
13.0	1417	-	0.5806
13.3028	1450	-	0.5708
13.7615	1500	0.0176	0.5744
14.0	1526	-	0.5745
14.2202	1550	-	0.5766
14.6789	1600	-	0.5726
15.0	1635	-	0.5778
15.1376	1650	-	0.5805
15.5963	1700	-	0.5673
16.0	1744	-	0.5823
16.0550	1750	-	0.5857
16.5138	1800	-	0.5710
16.9725	1850	-	0.5763
17.0	1853	-	0.5756
17.4312	1900	-	0.5802
17.8899	1950	-	0.5683
18.0	1962	-	0.5669
18.3486	2000	0.0145	0.5729
18.8073	2050	-	0.5690
19.0	2071	-	0.5736
19.2661	2100	-	0.5641
19.7248	2150	-	0.5729
20.0	2180	-	0.5709
20.1835	2200	-	0.5598
20.6422	2250	-	0.5666
21.0	2289	-	0.5712
21.1009	2300	-	0.5692
21.5596	2350	-	0.5684
22.0	2398	-	0.5760
22.0183	2400	-	0.5765
22.4771	2450	-	0.5578
22.9358	2500	0.0109	0.5676
23.0	2507	-	0.5663
23.3945	2550	-	0.5673
23.8532	2600	-	0.5670
24.0	2616	-	0.5662
24.3119	2650	-	0.5683
24.7706	2700	-	0.5724
25.0	2725	-	0.5676
25.2294	2750	-	0.5641
25.6881	2800	-	0.5671
26.0	2834	-	0.5615
26.1468	2850	-	0.5579
26.6055	2900	-	0.5626
27.0	2943	-	0.5576
27.0642	2950	-	0.5516
27.5229	3000	0.0088	0.5602
27.9817	3050	-	0.5654
28.0	3052	-	0.5631
28.4404	3100	-	0.5671
28.8991	3150	-	0.5668
29.0	3161	-	0.5640
29.3578	3200	-	0.5619
29.8165	3250	-	0.5593
30.0	3270	-	0.5574
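
The Epoch and Step columns are related by simple arithmetic, which the short sketch below reproduces from numbers already in this card. The variable names are chosen for illustration only.

```python
import math

train_samples = 1634       # from "Training Dataset"
steps_per_epoch = 109      # step count logged at epoch 1.0
num_epochs = 30            # num_train_epochs

# Total optimizer steps over the whole run.
total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 3270, the final logged step

# Fractional epoch shown for a given step, e.g. step 50.
epoch_at_step_50 = round(50 / steps_per_epoch, 4)
print(epoch_at_step_50)  # 0.4587

# Samples consumed per optimizer step implied by the log.
implied_batch = math.ceil(train_samples / steps_per_epoch)
print(implied_batch)  # 15
```

The implied 15 samples per step is three times per_device_train_batch_size (5). That would be consistent with training on three devices, but the card does not state the hardware, so treat this as an inference rather than a documented fact.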

Framework Versions

Python: 3.10.16
Sentence Transformers: 4.1.0
Transformers: 4.52.3
PyTorch: 2.7.0+cu126
Accelerate: 1.7.0
Datasets: 3.6.0
Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

JLee0/rag-embedder-staria-30epochs