SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Alessio-Borgi/all-mpnet-base-v2-margin-based-triplet-loss-finetuned-culture-15-epochs-enhanced")
# Run inference
sentences = [
    "Intendant of Montevideo head of government of Montevideo The Intendant of Montevideo is head of the executive branch of the government of Montevideo. The Intendant serves a five-year term and is limited to two successive terms. According to the Constitution, the officeholder is elected in a direct election, which takes place on a date different from that of presidential elections. {'post': 'Intendant', 'body': 'Montevideo', 'insignia': 'Coat of arms of Montevideo Department.svg', 'insigniasize': '100px', 'native_name': '{{langx|es|Intendente de Montevideo}}', 'incumbent': 'Mauricio Zunino', 'incumbentsince': '8 July 2024', 'style': 'No courtesy, title or style', 'appointer': 'Direct election', 'termlength': '5 years (renewable)', 'formation': '19 January 1909', 'inaugural': 'Daniel Muñoz', 'website': '{{url|http://www.montevideo.gub.uy//}}', 'seat': 'City Hall of Montevideo'} {'instance of': 'human', 'sex or gender': 'male', 'occupation': 'politician', 'subclass of': 'politician', 'languages spoken, written or signed': 'English', 'different from': 'John Major', 'position held': 'member of the Chamber of Representatives of Belgium'}",
    "Bernard Archard British actor (1916-2008) Bernard Joseph Archard (20 August 1916 – 1 May 2008) was an English actor who made many film and television appearances. {'caption': 'Archard in 1962', 'name': 'Bernard Archard', 'birth_date': '{{Birth date|df|=|y|1916|08|20}}', 'birth_place': 'Fulham, London, England', 'death_date': '{{Death date and age|df|=|y|2008|5|1|1916|8|20}}', 'death_place': 'Witham Friary, Somerset, England', 'occupation': 'Actor', 'yearsactive': '1939–1994', 'domesticpartner': 'James Belchamber', 'aliases': ['Bernard Joseph Archard']} {'instance of': 'human', 'occupation': 'theatrical director', 'sex or gender': 'male', 'languages spoken, written or signed': 'Polish', 'country of citizenship': 'Poland'}",
    'department of corrections a type of government agency, found in many jurisdictions, responsible for managing prisons and parole/probation services In criminal justice, particularly in North America, correction, corrections, and correctional, are umbrella terms describing a variety of functions typically carried out by government agencies, and involving the punishment, treatment, and supervision of persons who have been convicted of crimes. These functions commonly include imprisonment, parole, and probation. A typical correctional institution is a prison. A correctional system, also known as a penal system, thus refers to a network of agencies that administer a jurisdiction\'s prisons, and community-based programs like parole, and probation boards. This system is part of the larger criminal justice system, which additionally includes police, prosecution and courts. Jurisdictions throughout Canada and the US have ministries or departments, respectively, of corrections, correctional services, or similarly-named agencies. "Corrections" is also the name of a field of academic study concerned with the theories, policies, and programs pertaining to the practice of corrections. Its object of study includes personnel training and management as well as the experiences of those on the other side of the fence — the unwilling subjects of the correctional process. Stohr and colleagues (2008) write that "Earlier scholars were more honest, calling what we now call corrections by the name penology, which means the study of punishment for crime." {\'aliases\': [\'correctional agency\', \'correctional service\']} {\'subclass of\': \'government agency\', \'described by source\': \'Small Brockhaus and Efron Encyclopedic Dictionary\', \'on focus list of Wikimedia project\': \'Wikipedia:Vital articles/Level/4\', \'different from\': \'Tribunal\'}',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Training Details

Training Dataset

Unnamed Dataset

  • Size: 6,551 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    sentence_0 sentence_1 sentence_2
    type string string string
    details
    • min: 21 tokens
    • mean: 297.94 tokens
    • max: 384 tokens
    • min: 21 tokens
    • mean: 302.7 tokens
    • max: 384 tokens
    • min: 21 tokens
    • mean: 297.52 tokens
    • max: 384 tokens
  • Samples:
    sentence_0 sentence_1 sentence_2
    Hollywood Undead American rap rock band Hollywood Undead is an American rap rock band from Los Angeles, California, formed in 2005. All of the band members use pseudonyms and previously wore their own unique mask, most of which were based on the common hockey goaltender design. The band currently consists of five members: J-Dog, Funny Man, Johnny 3 Tears, Charlie Scene, and Danny. They released their debut album, Swan Songs, on September 2, 2008, and their live CD/DVD Desperate Measures, on November 10, 2009. Their second studio album, American Tragedy, was released April 5, 2011. Their third studio album, titled Notes from the Underground, was released on January 8, 2013. Their fourth studio album, Day of the Dead, was released on March 31, 2015. Hollywood Undead's fifth record is titled Five (or V), and was released on October 27, 2017. The first single from the album, called "California Dreaming", was made available July 24, 2017. Their sixth studio album, New Empire, Vol. 1, was re... Oskar Fischinger German-American abstract animator, filmmaker, and painter (1900-1967) Oskar Wilhelm Fischinger (June 22, 1900 – January 31, 1967) was a German-American abstract animator, filmmaker, and painter, notable for creating abstract musical animation many decades before the appearance of computer graphics and music videos. He created special effects for Fritz Lang's 1929 Woman in the Moon, one of the first sci-fi rocket films, and influenced Disney's Fantasia. He made over 50 short films and painted around 800 canvases, many of which are in museums, galleries, and collections worldwide. Among his film works is Motion Painting No. 1 (1947), which is now listed on the National Film Registry of the U.S. Library of Congress. {'name': 'Oskar Fischinger', 'birth_name': 'Oskar Wilhelm Fischinger', 'birth_date': '{{Birth date 1900
    social media and television Emerging platforms {} {'subclass of': 'television'} public historian historian who fosters the general public's engagement with history Public history is a broad range of activities undertaken by people with some training in the discipline of history who are generally working outside of specialized academic settings. Public history practice is deeply rooted in the areas of historic preservation, archival science, oral history, museum curatorship, and other related fields. The field has become increasingly professionalized in the United States and Canada since the late 1970s. Some of the most common settings for the practice of public history are museums, historic homes and historic sites, parks, battlefields, archives, film and television companies, new media, and all levels of government. {'note': 'infobox not present in Wikipedia'} {'employer': 'University of Amsterdam', 'place of death': 'Luxembourg', 'instance of': 'profession', 'subclass of': 'historian'} Yves Béhar Swiss designer, entrepreneur, and sustainability advocate Yves Béhar (IPA: [iv be.aʁ]; born 9 May 1967) is a Swiss-born American designer, entrepreneur, and educator. He is the founder and principal designer of Fuseproject, an industrial design and brand development firm. Béhar is also the co-founder and Chief Creative Officer of August Smart Lock, a smart lock company acquired by Assa Abloy in 2017; and co-founder of Canopy, a co-working space based in San Francisco. In 2011, the Conde Nast Innovation and Design Awards recognized Béhar as Designer of the Year. His clients have included Herman Miller, Movado, PUMA, Kodak, MINI, Western Digital, See Better to Learn Better, General Electric, Swarovski, Samsung, SNOO'S Happiest Baby Smart Bassinet, Jimmyjane, Prada and Cobalt Robotics. {'name': 'Yves Béhar', 'caption': 'Béhar in 2010', 'other_names': 'Yves Behar', 'birth_date': '{{birth date and age
    Asterix and Cleopatra 1968 Belgian/French animated film Asterix and Cleopatra (French: Astérix et Cléopâtre) is a 1968 Belgian–French animated comedy film; it is the second Asterix adventure to be made into a feature film. Overseen by Asterix creators Goscinny and Uderzo (who had no involvement in the production of the first film, Asterix the Gaul and in their director debuts), the film is noticeably more well-produced than its predecessor, featuring far more detailed animation and a more polished soundtrack. Asterix and Cleopatra is practically a musical, featuring three individual song sequences and a more varied score than the earlier film. Elements of satire and surreal humour (such as Cleopatra's singing lion and the engraving of Santa Claus on the pyramid wall) are prominent throughout. {'name': 'Asterix and Cleopatra', 'native_name': 'Astérix et Cléopâtre', 'caption': 'Original release poster', 'director': 'René Goscinny
    Albert Uderzo', 'producer': 'Raymond Leblanc', 'scree...
    huevos rancheros Mexican breakfast dish Huevos rancheros (Spanish pronunciation: [ˈweβos ranˈtʃeɾos], 'ranch-style eggs') is a breakfast egg dish served in the style of the traditional large mid-morning fare on rural Mexican farms. {'name': 'Huevos rancheros', 'country': 'Mexico', 'course': 'Breakfast', 'main_ingredient': 'Tortillas, eggs, salsa, refried beans, avocado or guacamole'} {'instance of': 'dish', 'subclass of': 'dish', 'course': 'main course', 'has part(s)': 'meat', 'country of origin': 'Mexico', 'made from material': 'garlic'} Project Hummerschere proposed project to expand Heligoland Project Hummerschere (English: Project Lobster Claw) was a construction project proposed by Nazi Germany's Kriegsmarine to expand the naval facilities on the island of Heligoland in the years leading up to World War II. Intended to create a large naval installation for operations in the North Sea, the plan involved expanding the island to its pre-1629 dimensions, restoring large areas which had been eroded by the sea. Construction was planned to more than double the usable area of both islands of Heligoland, allowing for the creation of a major naval base and Luftwaffe installation. Conceived by German Admiral Erich Raeder, the extensive project began in 1937. It was enthusiastically endorsed by Adolf Hitler, who personally inspected the construction in August 1938. After the onset of World War II the island became vulnerable to British air raids, and the project was abandoned as a result. == References == {'note': 'infobox not...
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 0.5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 15
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.6105 500 0.267
1.2210 1000 0.1538
1.8315 1500 0.0981
2.4420 2000 0.0675
3.0525 2500 0.0494
3.6630 3000 0.0302
4.2735 3500 0.0216
4.8840 4000 0.0141
5.4945 4500 0.0137
6.1050 5000 0.0112
6.7155 5500 0.0073
7.3260 6000 0.0059
7.9365 6500 0.0055
8.5470 7000 0.0041
9.1575 7500 0.0034
9.7680 8000 0.0015
10.3785 8500 0.002
10.9890 9000 0.0012
11.5995 9500 0.0008
12.2100 10000 0.0008
12.8205 10500 0.0004
13.4310 11000 0.0002
14.0415 11500 0.0002
14.6520 12000 0.0001

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Downloads last month
7
Safetensors
Model size
109M params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Alessio-Borgi/all-mpnet-base-v2-margin-based-triplet-loss-finetuned-culture-15-epochs-enhanced

Finetuned
(281)
this model