SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
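
The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are assumptions made for illustration, whereas the real model averages 768-dimensional token vectors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask value 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # [2.0, 3.0]
sentence_embedding = l2_normalize(pooled)   # unit-length vector
```

Because the final stage normalizes every embedding to unit length, cosine similarity between two embeddings reduces to their dot product.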

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Alessio-Borgi/all-mpnet-base-v2-margin-based-triplet-loss-finetuned-culture-20-epochs-enhanced")
# Run inference
sentences = [
    'FIBA Saporta Cup international men\'s basketball club tournament in Europe between the seasons of 1966/1967-2001/2002 The FIBA Saporta Cup, founded as FIBA European Cup Winners Cup, was the name of the second-tier level European-wide professional club basketball competition, where the domestic National Cup winners, from all over Europe, played against each other. The competition was organized by FIBA Europe. It was named after the late Raimundo Saporta, a former Real Madrid director. {\'title\': \'FIBA Saporta Cup\', \'logo\': \'Copa Saporta.png\', \'pixels\': \'100px\', \'caption\': "The FIBA Saporta Cup\'s championship trophy", \'organiser\': \'FIBA Europe\', \'founded\': \'{{Start date and years ago|df|=|yes|1966}}\', \'first\': "\'\'\'FIBA European Cup Winners Cup\'\'\'<br />1966–67<br />\'\'\'FIBA European Cup\'\'\'<br />1991–92<br />\'\'\'FIBA EuroCup\'\'\'<br />1996–97<br />\'\'\'FIBA Saporta Cup\'\'\'<br />1998–99", \'folded\': \'{{Start date and years ago|df|=|yes|2002}}\', \'region\': \'Europe\', \'level\': \'2\', \'pyramid\': \'European professional club basketball system\', \'champions\': \'{{flagicon|ITA}} Montepaschi Siena (1st title)\', \'season\': \'2001–02\', \'most_champs\': \'{{flagicon|ESP}} Real Madrid<br/> {{flagicon|Italy}} Cantù<br/>(4 titles each)\'} {\'subclass of\': \'recurring sporting event\', \'instance of\': \'recurring sporting event\', \'organizer\': \'International Olympic Committee\', \'event interval\': \'{"amount": "+4", "unit": "http://www.wikidata.org/entity/Q577"}\', \'part of\': \'Summer Olympic Games\'}',
    "figure skate type of ice skate used by figure skaters Figure skates are a type of ice skate used by figure skaters. The skates consist of a boot and a blade that is attached with screws to the sole of the boot. Inexpensive sets for recreational skaters are available, but most figure skaters purchase boots and blades separately and have the blades mounted by a professional skate technician. {'note': 'infobox not present in Wikipedia'} {'subclass of': 'sports equipment', 'sport': 'tennis', 'instance of': 'sports equipment', 'on focus list of Wikimedia project': 'Wikipedia:Vital articles/Level/4', 'described by source': 'Brockhaus and Efron Encyclopedic Dictionary'}",
    'Eddie Murphy American actor and comedian Edward Regan Murphy (born April 3, 1961) is an American actor, comedian, and singer. He had his breakthrough as a standup comic before gaining stardom for his film roles; he is widely recognized as one of the greatest comedians of all time. He has received several accolades including a Golden Globe Award, a Grammy Award, and an Emmy Award as well as nominations for an Academy Award and a BAFTA Award. He was honored with the Mark Twain Prize for American Humor in 2015 and the Cecil B. DeMille Award in 2023. Murphy shot to fame on the sketch comedy show Saturday Night Live, for which he was a regular cast member from 1980 to 1984 and broke out as a movie star in the 1980s films 48 Hrs., Trading Places, and Beverly Hills Cop. He then established himself as a leading man with starring roles in: The Golden Child (1986), Coming to America (1988), Harlem Nights (which he also directed) (1989), Boomerang (1992), The Nutty Professor (1996), Dr. Dolittle (1997), Bowfinger (1999), Daddy Day Care (2003), and Norbit (2007). Murphy both won the Golden Globe for Best Supporting Actor and received a nomination for the Academy Award for Best Supporting Actor for his role in Dreamgirls (2006). Murphy has worked as a voice actor, including Mushu in Disney\'s Mulan (1998), Thurgood Stubbs in the sitcom The PJs (1999–2001), and Donkey in the Shrek franchise (2001–present), the latter of which earned him a BAFTA Award for Best Actor in a Supporting Role nomination. Murphy often takes on multiple roles in a single film, such as in Coming to America, Vampire in Brooklyn, the Nutty Professor films, Bowfinger, The Adventures of Pluto Nash and Norbit. This is intended as Murphy\'s tribute to one of his idols, Peter Sellers. 
Following a string of poorly received films, he had a career resurgence with leading roles in films such as Dolemite Is My Name (2019), Coming 2 America (2021), You People, Candy Cane Lane (both 2023) and Beverly Hills Cop: Axel F (2024). In 2020, he won his first Primetime Emmy Award for Outstanding Guest Actor in a Comedy Series for hosting Saturday Night Live. Murphy\'s films have grossed over $3.8 billion ($6.5 billion adjusted for inflation) in the United States and Canada box office, and over $6.7 billion worldwide. In 2015, his films made him the sixth-highest grossing actor in the United States. As a singer, Murphy has released three studio albums, including How Could It Be (1985), So Happy (1989), and Love\'s Alright (1993). He is also known for his 1985 single "Party All the Time", which peaked at number two on the Billboard Hot 100. {\'name\': \'Eddie Murphy\', \'caption\': "Murphy in 2010, at the premiere of \'\'Shrek Forever After\'\'", \'birth_name\': \'Edward Regan Murphy\', \'birth_date\': \'{{birth date and age|mf|=|yes|1961|4|3}}\', \'birth_place\': \'New York City, U.S.\', \'medium\': \'{{hlist|Stand-up|film|television|music}}\', \'active\': \'1976–present\', \'genre\': \'{{hlist|Observational comedy|musical comedy|blue comedy|black comedy|insult comedy|sketch comedy|racial humor|satire}}\', \'subject\': \'{{hlist|African-American culture|race relations|racism|marriage|sex|everyday life|pop culture}}\', \'spouse\': \'{{marriage|Nicole Mitchell|1993|2006|end|=|div}} <br> {{marriage|Paige Butcher|2024}}\', \'children\': \'10\', \'relatives\': \'Charlie Murphy (brother)\', \'module\': \'{{Infobox musical artist \\n| embed |=| yes\\n| genre |=| |hlist|R&B|funk|synth-pop|comedy|\\n| instruments |=| Vocals\\n| label |=| |hlist|Sony BMG|CBS|Columbia|Motown|\\n| associated_acts |=| Arsenio Hall <br />Rick James}} {{hlist|R&B|funk|synth-pop|comedy}} {{hlist|Sony BMG|CBS|Columbia|Motown}}\', \'aliases\': [\'Edward Murphy\', \'Edward Regan Murphy\']} 
{\'occupation\': \'film producer\', \'instance of\': \'human\', \'sex or gender\': \'male\', \'languages spoken, written or signed\': \'English\', \'described by source\': \'Obálky knih\'}',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
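
The similarity scores above are pairwise cosine similarities, so one common follow-up is ranking candidates against a query. The sketch below shows that idea in pure Python on toy 2-dimensional vectors; `cosine_similarity`, `rank_by_similarity`, and the example vectors are hypothetical stand-ins for real embeddings, not part of the library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar."""
    scored = [(cosine_similarity(query, c), i) for i, c in enumerate(candidates)]
    return [i for _, i in sorted(scored, key=lambda pair: pair[0], reverse=True)]

# Toy vectors standing in for model embeddings (hypothetical values).
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]

ranking = rank_by_similarity(query, candidates)
print(ranking)  # [1, 0, 2]
```

With real data, `query` and `candidates` would be rows of the array returned by `model.encode`.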

Training Details

Training Dataset

Unnamed Dataset

  • Size: 6,551 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
              sentence_0       sentence_1       sentence_2
    type      string           string           string
    min       31 tokens        46 tokens        59 tokens
    mean      298.27 tokens    298.54 tokens    304.66 tokens
    max       384 tokens       384 tokens       384 tokens
  • Samples:
    sentence_0 sentence_1 sentence_2
    Caroline Records American record label Caroline Records is a record label that was founded in 1973. Founded in the United Kingdom to showcase British progressive rock groups, the label ceased releasing titles in 1976 and then re-emerged in the United States in 1986. The label released the work of American punk rock, thrash metal and new wave music bands. Caroline had a number of subsidiary labels, including Astralwerks, Gyroscope, Caroline Blue Plate, Beat the World, Scamp and Passenger. In 2013, the brand was relaunched by Universal Music via the Capitol Music Group. {'name': 'Caroline', 'parent': 'Universal Music Group', 'founded': '1973', 'founder': 'Richard Branson', 'distributor': 'Virgin Music Group, Universal Music Group, Capitol Music Group', 'status': 'Active', 'genre': 'Various', 'country': 'United Kingdom, United States', 'location': 'New York City, U.S.', 'aliases': ['Caroline']} {'instance of': 'record label', 'country': 'United States', 'genre': 'jazz', 'parent organizati... European Masters Games multi-sport event The European Masters Games (EMG) is a multi-sport event, consisting of summer sports, held every four years. The European Masters Games are owned by the International Masters Games Association (IMGA), owners of the World Masters Games. The age categories vary depending on the sport, but the competition is generally for people 30–35 years or older. The first games were held in 2008 in Malmö, Sweden. The European Masters Games are held once every four years, while the last games were held in 2015 in Nice, France. The next games will be celebrated in Turin, Italy, in 2019. The International Masters Games Association (IMGA), based in Lausanne, Switzerland, is the body responsible for the bidding and placing of the games. 
{'name': 'European Masters Games', 'formation': '2008', 'headquarters': 'IMGA', 'leader_name': 'The Duke of Sussex'} {'instance of': 'recurring sporting event', 'subclass of': 'recurring sporting event', 'event interval': '{"amount"... Clayography clay Animation technique Adam Benjamin Elliot is an Australian animator and filmmaker based in Melbourne. Established as an independent auteur of minimalistic narrative-driven films in animation, all of his films have generally use of tragicomedy genre with themes of bittersweet nature and psychological development to the characters; based loosely on his family and friends, each of his films is considered a Clayography – a portmanteau genre of clay animation and biography, coined by himself. {'name': 'Adam Elliot', 'caption': 'Elliot in 2010', 'birth_place': 'Berwick, Victoria, Australia', 'field': 'Clayographies – Clay Animated Biographies', 'training': 'The Victorian College of the Arts.', 'works': "''Mary and Max'', ''Harvie Krumpet'', ''Memoir of a Snail'', ''Ernie Biscuit'', ''Uncle'', ''Cousin'', ''Brother''", 'awards': 'Academy Award,\nFive Australian Film Institute Awards, Young Achiever of the Year for Victoria – 1999, Australian of the Year Award'} {'instance of':...
    political rights rights of citizens to participate, directly or indirectly, in the establishment or administration of government Civil and political rights are a class of rights that protect individuals' freedom from infringement by governments, social organizations, and private individuals. They ensure one's entitlement to participate in the civil and political life of society and the state. Civil rights generally include ensuring peoples' physical and mental integrity, life, and safety, protection from discrimination, the right to privacy, the freedom of thought, speech, religion, press, assembly, and movement. Political rights include natural justice (procedural fairness) in law, such as the rights of the accused, including the right to a fair trial; due process; the right to seek redress or a legal remedy; and rights of participation in civil society and politics such as freedom of association, the right to assemble, the right to petition, the right of self-defense, and the right t... quenelle gesture The quenelle (French: [kə.nɛl]) is a gesture created and popularized by French comedian Dieudonné M'bala M'bala. He first used it in 2005 in his sketch entitled "1905" about French secularism, and has used it since in a wide variety of contexts. The quenelle became popular, with many photos posted to the Internet showing individuals posing while performing quenelles at mundane places (wedding parties, high school classes, etc.). In late 2013, following its use by professional footballer Nicolas Anelka during a match, Jewish leaders, anti-racism groups, and public officials in France have interpreted it as an inverted Nazi salute and as an expression of antisemitism. French officials have sought to ban the gesture due to its perceived subtext of antisemitism. 
{'note': 'infobox not present in Wikipedia'} {'subclass of': 'gesture', 'instance of': 'gesture', 'uses': 'hand', 'described by source': 'Brockhaus and Efron Encyclopedic Dictionary'} organic architecture philosophy of architectural design Organic architecture is a philosophy of architecture which promotes harmony between human habitation and the natural world. This is achieved through design approaches that aim to be sympathetic and well-integrated with a site, so buildings, furnishings, and surroundings become part of a unified, interrelated composition. {'by': 'no', 'onlinebooks': 'no', 'about': 'yes', 'wikititle': 'organic architecture'} {'instance of': 'architectural style', 'on focus list of Wikimedia project': 'Wikipedia:Vital articles/Level/4', 'described by source': 'Encyclopædia Britannica 11th edition', 'subclass of': 'architectural style', 'part of': 'Gothic architecture'}
    nod gesture A nod of the head is a gesture in which the head is tilted in alternating up and down arcs along the sagittal plane. In many cultures, it is most commonly, but not universally, used to indicate agreement, acceptance, or acknowledgement. {'aliases': ['head nod']} {'instance of': 'gesture'} evocation act of calling upon or summoning a spirit, demon, god or other supernatural agent Evocation is the act of evoking, calling upon, or summoning a spirit, demon, deity or other supernatural agents, in the Western mystery tradition. Conjuration also refers to a summoning, often by the use of a magical spell. The conjuration of the ghosts or spirits of the dead for the purpose of divination is called necromancy. Comparable practices exist in many religions and magical traditions and may employ the use of mind-altering substances with and without uttered word formulas. {'aliases': ['summoning']} {'subclass of': 'ritual', 'instance of': 'ritual', 'described by source': 'Encyclopædia Britannica 11th edition', 'studied by': 'liturgics'} Poliçan city in Albania Poliçan (Albanian definite form: Poliçani) is a city and a municipality situated in south-central Albania. It was formed at the 2015 local government reform by the merger of the former municipalities Poliçan, Tërpan and Vërtop, that became municipal units. The seat of the municipality is the town Poliçan. The total population is 10,953 (2011 census), in a total area of 272.20 km2. The population of the former municipality at the 2011 census was 4,318. Historically, Polican has been part of Berat County except from 1978 to 1991 when city was made part of Skrapar due to political pressure from the regime at that time. It was established as an industrial center. {'type': 'm', 'name': 'Poliçan', 'flag': 'Flag of Poliçan.gif', 'emblem': 'Stema e Bashkisë Poliçan.svg', 'skyline': 'Poliçan.jpg', 'county': 'Berat', 'mayor': 'Adriatik Zotkaj', 'party': 'PS', 'coordinates': '{{coord
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 0.5
    }
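
As a rough illustration of these parameters, the per-triplet loss with Euclidean distance and a margin of 0.5 is max(d(anchor, positive) - d(anchor, negative) + margin, 0). The sketch below is a minimal pure-Python version on toy vectors, not the library's batched implementation. It assumes that sentence_0, sentence_1, and sentence_2 play the roles of anchor, positive, and negative, which the card does not state explicitly.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on the gap between positive and negative distances."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)

# Toy 2-dimensional vectors (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.3, 0.0]

# Negative is already more than `margin` farther away than the positive: no loss.
easy = triplet_loss(anchor, positive, [1.0, 0.0])

# Negative is only slightly farther than the positive: loss of about 0.4.
hard = triplet_loss(anchor, positive, [0.4, 0.0])
```

A triplet stops contributing to the loss once the negative is at least 0.5 farther from the anchor than the positive, which is consistent with the training loss approaching zero in the logs.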
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 20
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
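
To make the learning-rate settings above concrete, here is a minimal sketch of a linear decay schedule with learning_rate 5e-05 and warmup_steps 0. The function `linear_lr` is a hypothetical helper, not the trainer's scheduler, and the total of 16,380 steps is an assumption derived from 6,551 samples at batch size 8 (819 steps per epoch, rounded up) over 20 epochs on a single device.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Assumption: 819 steps per epoch * 20 epochs, single device.
TOTAL_STEPS = 16380

start = linear_lr(0, TOTAL_STEPS)              # full base rate
halfway = linear_lr(8190, TOTAL_STEPS)         # about half the base rate
end = linear_lr(TOTAL_STEPS, TOTAL_STEPS)      # decays to zero
```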

Training Logs

Epoch Step Training Loss
0.6105 500 0.2524
1.2210 1000 0.1646
1.8315 1500 0.1053
2.4420 2000 0.0746
3.0525 2500 0.0456
3.6630 3000 0.0301
4.2735 3500 0.0296
4.8840 4000 0.0204
5.4945 4500 0.014
6.1050 5000 0.0137
6.7155 5500 0.0086
7.3260 6000 0.0098
7.9365 6500 0.0087
8.5470 7000 0.0063
9.1575 7500 0.0047
9.7680 8000 0.0036
10.3785 8500 0.0031
10.9890 9000 0.004
11.5995 9500 0.0034
12.2100 10000 0.0026
12.8205 10500 0.002
13.4310 11000 0.0026
14.0415 11500 0.0019
14.6520 12000 0.0009
15.2625 12500 0.001
15.8730 13000 0.0007
16.4835 13500 0.0002
17.0940 14000 0.0003
17.7045 14500 0.0005
18.3150 15000 0.0001
18.9255 15500 0.0002
19.5360 16000 0.0
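
The Epoch column can be reproduced from the Step column with a little arithmetic. The sketch below assumes a single device, per_device_train_batch_size 8, no gradient accumulation, and a final partial batch that is kept (dataloader_drop_last is False); the names used are illustrative only.

```python
import math

TRAIN_SAMPLES = 6551   # training set size from the card
BATCH_SIZE = 8         # per_device_train_batch_size
EPOCHS = 20            # num_train_epochs

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)   # 819
total_steps = steps_per_epoch * EPOCHS                    # 16380

def epoch_at(step):
    """Fractional epoch reached after a given optimizer step."""
    return round(step / steps_per_epoch, 4)

print(epoch_at(500))    # 0.6105, matching the first log row
print(epoch_at(16000))  # 19.536, matching the last log row
```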

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Safetensors

  • Model size: 109M params
  • Tensor type: F32