Thai Food Ingredients → Dish Prediction

This is a sentence-transformers model fine-tuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2 on pairs of Thai ingredient lists (anchor) and dish names (positive). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
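
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy vectors are illustrative assumptions, and padding positions are assumed to be excluded through an attention mask.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the non-padding tokens (mask value 1)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example: three token vectors of width 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 768 dimensions and inputs are truncated to at most 128 tokens before pooling.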

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Chanisorn/thai-food-mpnet-tuned")
# Run inference
sentences = [
    'นมเปรี้ยว, เยลลี่รวมรสผลไม้',
    'ไอศกรีมโยเกิร์ตเยลลี่ปีโป้',
    'ทองหยอด',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
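
For the ingredients-to-dish use case, the similarity scores are typically used to rank candidate dish names for one ingredient list. The sketch below shows that ranking step with cosine similarity on toy 2-dimensional vectors so that it runs without downloading the model. The names rank_dishes, dish_names, dish_vecs and query_vec are hypothetical; in practice the vectors would come from model.encode.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_dishes(
    query_vec: Sequence[float],
    dish_vecs: Sequence[Sequence[float]],
    dish_names: Sequence[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (dish name, score) pairs, best match first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in zip(dish_names, dish_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for embeddings (assumption: real ones are 768-dimensional).
dish_names = ["dish_a", "dish_b", "dish_c"]
dish_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vec = [2.0, 0.1]

top = rank_dishes(query_vec, dish_vecs, dish_names, top_k=2)
print([name for name, _ in top])  # ['dish_a', 'dish_c']
```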

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.551
cosine_accuracy@3 0.7959
cosine_accuracy@5 0.898
cosine_accuracy@10 0.9592
cosine_precision@1 0.551
cosine_precision@3 0.2653
cosine_precision@5 0.1796
cosine_recall@1 0.551
cosine_recall@3 0.7959
cosine_recall@5 0.898
cosine_ndcg@10 0.7617
cosine_mrr@10 0.698
cosine_map@100 0.6998
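
In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.7959 / 3 ≈ 0.2653), which is what one would expect if each query has exactly one relevant dish. Under that assumption, the metrics can be computed from the 1-based rank of the correct dish for each query. The sketch below is illustrative only; the function names and the toy ranks are not taken from the evaluation code.

```python
import math
from typing import List, Optional


def accuracy_at_k(ranks: List[Optional[int]], k: int) -> float:
    """Fraction of queries whose correct item appears in the top k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)


def precision_at_k(ranks: List[Optional[int]], k: int) -> float:
    """With one relevant item per query, at most 1 of the k slots is a hit."""
    return accuracy_at_k(ranks, k) / k


def mrr_at_k(ranks: List[Optional[int]], k: int) -> float:
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if r is not None and r <= k) / len(ranks)


def ndcg_at_k(ranks: List[Optional[int]], k: int) -> float:
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r is not None and r <= k) / len(ranks)


# Toy ranks for four queries; None means the correct dish was not retrieved.
ranks = [1, 2, 4, None]
print(accuracy_at_k(ranks, 3))  # 0.5
print(mrr_at_k(ranks, 10))      # 0.4375
print(precision_at_k(ranks, 3), ndcg_at_k(ranks, 10))
```

The reported accuracies are also consistent with 27, 39, 44 and 47 correct queries out of the 49 evaluation samples at k = 1, 3, 5 and 10.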

Training Details

Training Dataset

Unnamed Dataset

  • Size: 400 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 400 samples:
    • anchor: type string; min 4 tokens, mean 44.22 tokens, max 105 tokens
    • positive: type string; min 4 tokens, mean 9.92 tokens, max 24 tokens
  • Samples:
    • anchor: ผักกระเฉด, กุ้งผ่า, ปลากระป๋อง, พริกสด, หอมแดง, กระชาย, กระปิ, เกลือป่น, น้ำตาล, น้ำปลา, น้ำมะขามเปียก, มะนาว
      positive: แกงส้มผักกระเฉด
    • anchor: หมูสามชั้น, น้ำส้มสายชู, เกลือ, น้ำมันงา, กระเทียม, หอมแดง, รากผักชี, เต้าเจี้ยว, ซอสหอยนางรม, น้ำตาล, ซีอิ๊วดำ, น้ำซุป, นมสด, แป้งมัน, งาขาว
      positive: ข้าวหมูกรอบ
    • anchor: ซอสถั่วเหลืองจิ้ม, ไข่ไก่, เบคอน, ไส้กรอก, แฮม, ชาร้อน, น้ำสะอาด
      positive: ชุดอาหารเช้า
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
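
MultipleNegativesRankingLoss treats every other positive in the batch as a negative for a given anchor: it scales the cosine similarities by "scale" and applies cross-entropy with the matching pair as the target class. The following is a minimal pure-Python sketch of that computation under those assumptions, not the library code; the names cos_sim and mnr_loss and the toy vectors are illustrative.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[Sequence[float]], positives: List[Sequence[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs: aligned pairs give a loss near 0, swapped pairs near 20.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < swapped_loss)  # True
```

Because the other items in a batch serve as negatives, a larger batch gives each anchor more negatives to be ranked against.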
    

Evaluation Dataset

Unnamed Dataset

  • Size: 49 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 49 samples:
    • anchor: type string; min 8 tokens, mean 43.39 tokens, max 105 tokens
    • positive: type string; min 5 tokens, mean 9.39 tokens, max 18 tokens
  • Samples:
    • anchor: เห็ด, กะปิ, น้ำปลา, ตะไคร้, หอมแดง, พริกขี้หนู, พริกแดง, ผักหวาน, ชะอม, ใบแมงลัก, ใบย่านาง, น้ำเปล่า, หน่อไม้
      positive: แกงเห็ดผักหวานใส่กะปิ
    • anchor: หมูสามชั้น, พริกไทย, กระเทียม, รากผักชี, อบเชย, ดอกจันทร์, ซีอิ้วขาว, ซีอิ้วดำ, น้ำตาลทราย, น้ำตาลปี๊บ, เกลือ, คนอร์
      positive: หมูฮ้อง
    • anchor: สะโพกไก่, เกลือ, พริกไทย, มันฝรั่ง, แครอท, แป้งสาลี, น้ำมัน, เนย, กระเทียม, หอม, ผงปรุงรส, ซอสมะเขือเทศ
      positive: สตูไก่ สูตรไม่ใส่เครื่องเทศ
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 24
  • per_device_eval_batch_size: 24
  • learning_rate: 5e-06
  • num_train_epochs: 16
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
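
These settings imply a short schedule: 400 samples at batch size 24 give 17 steps per epoch, so 16 epochs come to 272 steps, which matches the last step in the training logs. The sketch below works through the learning rate under a linear schedule with 10% warmup. It assumes that the warmup length is the ceiling of warmup_ratio × total steps and that the rate then decays linearly to zero; the function name linear_warmup_lr is illustrative.

```python
import math


def linear_warmup_lr(step: int, total_steps: int, base_lr: float = 5e-6, warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given step: linear ramp up, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


steps_per_epoch = math.ceil(400 / 24)          # 17
total_steps = steps_per_epoch * 16             # 272
warmup_steps = math.ceil(total_steps * 0.1)    # 28

print(steps_per_epoch, total_steps, warmup_steps)   # 17 272 28
print(linear_warmup_lr(warmup_steps, total_steps))  # peak rate, 5e-06
print(linear_warmup_lr(total_steps, total_steps))   # 0.0
```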

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 24
  • per_device_eval_batch_size: 24
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 16
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss thai-food-eval_cosine_ndcg@10
0.5882 10 3.2903 - -
1.0 17 - 2.3505 0.4150
1.1765 20 2.467 - -
1.7647 30 1.9808 - -
2.0 34 - 1.5080 0.6269
2.3529 40 1.6699 - -
2.9412 50 1.3974 - -
3.0 51 - 1.2515 0.6827
3.5294 60 1.1846 - -
4.0 68 - 1.1762 0.6735
4.1176 70 1.1183 - -
4.7059 80 1.001 - -
5.0 85 - 1.1104 0.6995
5.2941 90 0.9919 - -
5.8824 100 0.8285 - -
6.0 102 - 1.0806 0.7152
6.4706 110 0.7873 - -
7.0 119 - 1.0573 0.7333
7.0588 120 0.7359 - -
7.6471 130 0.6526 - -
8.0 136 - 1.0051 0.7566
8.2353 140 0.6004 - -
8.8235 150 0.571 - -
9.0 153 - 0.9988 0.7700
9.4118 160 0.5254 - -
10.0 170 0.5119 1.0087 0.7590
10.5882 180 0.492 - -
11.0 187 - 0.9756 0.77
11.1765 190 0.5334 - -
11.7647 200 0.4395 - -
12.0 204 - 0.9851 0.7707
12.3529 210 0.4362 - -
12.9412 220 0.3905 - -
13.0 221 - 1.0020 0.7578
13.5294 230 0.4388 - -
14.0 238 - 0.9994 0.7563
14.1176 240 0.4594 - -
14.7059 250 0.4502 - -
15.0 255 - 1.0004 0.7631
15.2941 260 0.3539 - -
15.8824 270 0.4144 - -
16.0 272 - 0.9997 0.7617
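  • The saved checkpoint is the row that was set in bold in the original table. With load_best_model_at_end: True, this is presumably the row with the lowest validation loss (epoch 11.0, step 187, validation loss 0.9756).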
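
The row with the lowest validation loss can be picked out of the logged evaluation rows as sketched below. This assumes checkpoint selection by validation loss, which is believed to be the Trainer default when metric_for_best_model is not set; that option is not listed among the hyperparameters above. The variable names are illustrative, and the values are copied from the evaluation rows of the table.

```python
# (epoch, step, validation_loss, ndcg_at_10) copied from the evaluation rows above.
eval_rows = [
    (1.0, 17, 2.3505, 0.4150),
    (2.0, 34, 1.5080, 0.6269),
    (3.0, 51, 1.2515, 0.6827),
    (4.0, 68, 1.1762, 0.6735),
    (5.0, 85, 1.1104, 0.6995),
    (6.0, 102, 1.0806, 0.7152),
    (7.0, 119, 1.0573, 0.7333),
    (8.0, 136, 1.0051, 0.7566),
    (9.0, 153, 0.9988, 0.7700),
    (10.0, 170, 1.0087, 0.7590),
    (11.0, 187, 0.9756, 0.77),
    (12.0, 204, 0.9851, 0.7707),
    (13.0, 221, 1.0020, 0.7578),
    (14.0, 238, 0.9994, 0.7563),
    (15.0, 255, 1.0004, 0.7631),
    (16.0, 272, 0.9997, 0.7617),
]

# Assumption: the best checkpoint is the one with the lowest validation loss.
best_by_loss = min(eval_rows, key=lambda row: row[2])
# For comparison: the row with the highest NDCG@10 is a different one.
best_by_ndcg = max(eval_rows, key=lambda row: row[3])

print(best_by_loss[1])  # 187
print(best_by_ndcg[1])  # 204
```

Validation loss flattens out around 1.0 from epoch 8 onward while training loss keeps falling, so later epochs add little on the evaluation set.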

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.7.0
  • Datasets: 2.14.4
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model tree for Chanisorn/thai-food-mpnet-tuned
