metadata
base_model: sentence-transformers/all-mpnet-base-v2
datasets:
- sentence-transformers/squad
language:
- en
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:87599
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: What prompted transportation improvements in Portugal in the 1970's?
sentences:
- >-
Greenhouses convert solar light to heat, enabling year-round production
and the growth (in enclosed environments) of specialty crops and other
plants not naturally suited to the local climate. Primitive greenhouses
were first used during Roman times to produce cucumbers year-round for
the Roman emperor Tiberius. The first modern greenhouses were built in
Europe in the 16th century to keep exotic plants brought back from
explorations abroad. Greenhouses remain an important part of
horticulture today, and plastic transparent materials have also been
used to similar effect in polytunnels and row covers.
- >-
By the early 1970s Portugal's fast economic growth with increasing
consumption and purchase of new automobiles set the priority for
improvements in transportation. Again in the 1990s, after joining the
European Economic Community, the country built many new motorways.
Today, the country has a 68,732 km (42,708 mi) road network, of which
almost 3,000 km (1,864 mi) are part of system of 44 motorways. Opened in
1944, the first motorway (which linked Lisbon to the National Stadium)
was an innovative project that made Portugal among one of the first
countries in the world to establish a motorway (this roadway eventually
became the Lisbon-Cascais highway, or A5). But, although a few other
tracts were created (around 1960 and 1970), it was only after the
beginning of the 1980s that large-scale motorway construction was
implemented. In 1972, Brisa, the highway concessionaire, was founded to
handle the management of many of the regions motorways. On many
highways, toll needs to be paid, see Via Verde. Vasco da Gama bridge is
the longest bridge in Europe.
- >-
Kanye West began his early production career in the mid-1990s, making
beats primarily for burgeoning local artists, eventually developing a
style that involved speeding up vocal samples from classic soul records.
His first official production credits came at the age of nineteen when
he produced eight tracks on Down to Earth, the 1996 debut album of a
Chicago rapper named Grav. For a time, West acted as a ghost producer
for Deric "D-Dot" Angelettie. Because of his association with D-Dot,
West wasn't able to release a solo album, so he formed and became a
member and producer of the Go-Getters, a late-1990s Chicago rap group
composed of him, GLC, Timmy G, Really Doe, and Arrowstar. His group was
managed by John "Monopoly" Johnson, Don Crowley, and Happy Lewis under
the management firm Hustle Period. After attending a series of
promotional photo shoots and making some radio appearances, The
Go-Getters released their first and only studio album World Record
Holders in 1999. The album featured other Chicago-based rappers such as
Rhymefest, Mikkey Halsted, Miss Criss, and Shayla G. Meanwhile, the
production was handled by West, Arrowstar, Boogz, and Brian "All Day"
Miller.
- source_sentence: What did Virchow feel Darwin's conclusions lacked?
sentences:
- >-
Similar organizations in other countries followed: The American
Anthropological Association in 1902, the Anthropological Society of
Madrid (1865), the Anthropological Society of Vienna (1870), the Italian
Society of Anthropology and Ethnology (1871), and many others
subsequently. The majority of these were evolutionist. One notable
exception was the Berlin Society of Anthropology (1869) founded by
Rudolph Virchow, known for his vituperative attacks on the
evolutionists. Not religious himself, he insisted that Darwin's
conclusions lacked empirical foundation.
- >-
Russian Imperialism led to the Russian Empire's conquest of Central Asia
during the late 19th century's Imperial Era. Between 1864 and 1885
Russia gradually took control of the entire territory of Russian
Turkestan, the Tajikistan portion of which had been controlled by the
Emirate of Bukhara and Khanate of Kokand. Russia was interested in
gaining access to a supply of cotton and in the 1870s attempted to
switch cultivation in the region from grain to cotton (a strategy later
copied and expanded by the Soviets).[citation needed] By 1885
Tajikistan's territory was either ruled by the Russian Empire or its
vassal state, the Emirate of Bukhara, nevertheless Tajiks felt little
Russian influence.[citation needed]
- >-
A solar balloon is a black balloon that is filled with ordinary air. As
sunlight shines on the balloon, the air inside is heated and expands
causing an upward buoyancy force, much like an artificially heated hot
air balloon. Some solar balloons are large enough for human flight, but
usage is generally limited to the toy market as the surface-area to
payload-weight ratio is relatively high.
- source_sentence: What is the object of study for linguistic anthropology?
sentences:
- >-
Anthropology of development tends to view development from a critical
perspective. The kind of issues addressed and implications for the
approach simply involve pondering why, if a key development goal is to
alleviate poverty, is poverty increasing? Why is there such a gap
between plans and outcomes? Why are those working in development so
willing to disregard history and the lessons it might offer? Why is
development so externally driven rather than having an internal basis?
In short why does so much planned development fail?
- >-
The study of kinship and social organization is a central focus of
sociocultural anthropology, as kinship is a human universal.
Sociocultural anthropology also covers economic and political
organization, law and conflict resolution, patterns of consumption and
exchange, material culture, technology, infrastructure, gender
relations, ethnicity, childrearing and socialization, religion, myth,
symbols, values, etiquette, worldview, sports, music, nutrition,
recreation, games, food, festivals, and language (which is also the
object of study in linguistic anthropology).
- >-
On 1 February 1908, the king Dom Carlos I of Portugal and his heir
apparent, Prince Royal Dom Luís Filipe, Duke of Braganza, were murdered
in Lisbon. Under his rule, Portugal had twice been declared bankrupt –
on 14 June 1892, and again on 10 May 1902 – causing social turmoil,
economic disturbances, protests, revolts and criticism of the monarchy.
Manuel II of Portugal became the new king, but was eventually overthrown
by the 5 October 1910 revolution, which abolished the regime and
instated republicanism in Portugal. Political instability and economic
weaknesses were fertile ground for chaos and unrest during the
Portuguese First Republic. These conditions would lead to the failed
Monarchy of the North, 28 May 1926 coup d'état, and the creation of the
National Dictatorship (Ditadura Nacional).
- source_sentence: What is the official name of Portugal?
sentences:
- >-
Portugal (Portuguese: [puɾtuˈɣaɫ]), officially the Portuguese Republic
(Portuguese: República Portuguesa), is a country on the Iberian
Peninsula, in Southwestern Europe. It is the westernmost country of
mainland Europe, being bordered by the Atlantic Ocean to the west and
south and by Spain to the north and east. The Portugal–Spain border is
1,214 km (754 mi) long and considered the longest uninterrupted border
within the European Union. The republic also includes the Atlantic
archipelagos of the Azores and Madeira, both autonomous regions with
their own regional governments.
- >-
The large magnitude of solar energy available makes it a highly
appealing source of electricity. The United Nations Development
Programme in its 2000 World Energy Assessment found that the annual
potential of solar energy was 1,575–49,837 exajoules (EJ). This is
several times larger than the total world energy consumption, which was
559.8 EJ in 2012.
- >-
It was temporarily under the control of the Tibetan empire and Chinese
from 650–680 and then under the control of the Umayyads in 710. The
Samanid Empire, 819 to 999, restored Persian control of the region and
enlarged the cities of Samarkand and Bukhara (both cities are today part
of Uzbekistan) which became the cultural centers of Iran and the region
was known as Khorasan. The Kara-Khanid Khanate conquered Transoxania
(which corresponds approximately with modern-day Uzbekistan, Tajikistan,
southern Kyrgyzstan and southwest Kazakhstan) and ruled between
999–1211. Their arrival in Transoxania signaled a definitive shift from
Iranian to Turkic predominance in Central Asia, but gradually the
Kara-khanids became assimilated into the Perso-Arab Muslim culture of
the region.
- source_sentence: >-
During what years did the formation of the First Portuguese Republic take
place?
sentences:
- >-
Anthrozoology (also known as "human–animal studies") is the study of
interaction between living things. It is a burgeoning interdisciplinary
field that overlaps with a number of other disciplines, including
anthropology, ethology, medicine, psychology, veterinary medicine and
zoology. A major focus of anthrozoologic research is the quantifying of
the positive effects of human-animal relationships on either party and
the study of their interactions. It includes scholars from a diverse
range of fields, including anthropology, sociology, biology, and
philosophy.[n 7]
- >-
Professional anthropological bodies often object to the use of
anthropology for the benefit of the state. Their codes of ethics or
statements may proscribe anthropologists from giving secret briefings.
The Association of Social Anthropologists of the UK and Commonwealth
(ASA) has called certain scholarship ethically dangerous. The AAA's
current 'Statement of Professional Responsibility' clearly states that
"in relation with their own government and with host governments ... no
secret research, no secret reports or debriefings of any kind should be
agreed to or given."
- >-
Many Portuguese holidays, festivals and traditions have a Christian
origin or connotation. Although relations between the Portuguese state
and the Roman Catholic Church were generally amiable and stable since
the earliest years of the Portuguese nation, their relative power
fluctuated. In the 13th and 14th centuries, the church enjoyed both
riches and power stemming from its role in the reconquest, its close
identification with early Portuguese nationalism and the foundation of
the Portuguese educational system, including the first university. The
growth of the Portuguese overseas empire made its missionaries important
agents of colonization, with important roles in the education and
evangelization of people from all the inhabited continents. The growth
of liberal and nascent republican movements during the eras leading to
the formation of the First Portuguese Republic (1910–26) changed the
role and importance of organized religion.
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2 on the squad dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: squad
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
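The Pooling and Normalize modules listed above can be illustrated with a small sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, which is what `pooling_mode_mean_tokens: True` and `Normalize()` indicate; the helper names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are hypothetical and stand in for the real 768-dimensional token embeddings.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    return [value / max(count, 1) for value in summed]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the output is unit length, the cosine similarity of two embeddings reduces to their dot product.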
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("lizchu414/mpnet-base-all-nli-squad")
# Run inference
sentences = [
'During what years did the formation of the First Portuguese Republic take place?',
'Many Portuguese holidays, festivals and traditions have a Christian origin or connotation. Although relations between the Portuguese state and the Roman Catholic Church were generally amiable and stable since the earliest years of the Portuguese nation, their relative power fluctuated. In the 13th and 14th centuries, the church enjoyed both riches and power stemming from its role in the reconquest, its close identification with early Portuguese nationalism and the foundation of the Portuguese educational system, including the first university. The growth of the Portuguese overseas empire made its missionaries important agents of colonization, with important roles in the education and evangelization of people from all the inhabited continents. The growth of liberal and nascent republican movements during the eras leading to the formation of the First Portuguese Republic (1910–26) changed the role and importance of organized religion.',
'Professional anthropological bodies often object to the use of anthropology for the benefit of the state. Their codes of ethics or statements may proscribe anthropologists from giving secret briefings. The Association of Social Anthropologists of the UK and Commonwealth (ASA) has called certain scholarship ethically dangerous. The AAA\'s current \'Statement of Professional Responsibility\' clearly states that "in relation with their own government and with host governments ... no secret research, no secret reports or debriefings of any kind should be agreed to or given."',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
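For semantic search, one row of the similarity matrix is usually sorted to rank passages against a query. The sketch below shows that ranking step on its own, using toy 3-dimensional vectors in place of the real embeddings; `rank_by_cosine` is a hypothetical helper, not part of the Sentence Transformers API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for one query embedding and three passage embeddings.
query = [0.9, 0.1, 0.0]
passages = [[0.0, 1.0, 0.0], [1.0, 0.2, 0.0], [0.0, 0.0, 1.0]]

ranking = rank_by_cosine(query, passages)
best_index = ranking[0][0]
```

With the real model, the query row of `similarities` from the snippet above could be sorted in the same way.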
Training Details
Training Dataset
squad
- Dataset: squad at d84c8c2
- Size: 87,599 training samples
- Columns: question and answer
- Approximate statistics based on the first 1000 samples:

|         | question | answer |
|:--------|:---------|:-------|
| type    | string | string |
| details | min: 6 tokens, mean: 14.46 tokens, max: 31 tokens | min: 34 tokens, mean: 187.2 tokens, max: 384 tokens |
- Samples (each question is followed by its answer passage):
To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?
Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.
What is in front of the Notre Dame Main Building?
Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.
The Basilica of the Sacred heart at Notre Dame is beside to which structure?
Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
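As a rough illustration of what this loss computes, the sketch below treats each question's paired answer as the positive and every other answer in the batch as a negative, then applies cross-entropy over cosine similarities multiplied by the scale of 20.0. It is a simplified plain-Python reading of the idea, not the library's implementation; `mnr_loss` and the toy 2-dimensional vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(questions, answers, scale=20.0):
    """Mean cross-entropy where answers[i] is the correct match for questions[i]."""
    losses = []
    for i, question in enumerate(questions):
        logits = [scale * cosine(question, answer) for answer in answers]
        # Numerically stable log-sum-exp over all answers in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


questions = [[1.0, 0.0], [0.0, 1.0]]
# Each question points the same way as its own answer: loss is close to 0.
aligned = mnr_loss(questions, [[1.0, 0.0], [0.0, 1.0]])
# Each question points towards the other pair's answer: loss is close to 20.
swapped = mnr_loss(questions, [[0.0, 1.0], [1.0, 0.0]])
```

Since other answers in the batch serve as negatives, a passage repeated within one batch would be treated as a negative for its own duplicate, which is presumably why the `no_duplicates` batch sampler is used.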
Evaluation Dataset
squad
- Dataset: squad at d84c8c2
- Size: 87,599 evaluation samples
- Columns: question and answer
- Approximate statistics based on the first 1000 samples:

|         | question | answer |
|:--------|:---------|:-------|
| type    | string | string |
| details | min: 7 tokens, mean: 13.84 tokens, max: 31 tokens | min: 28 tokens, mean: 151.09 tokens, max: 368 tokens |
- Samples (each question is followed by its answer passage):
What is one purpose of a greenhouse?
Greenhouses convert solar light to heat, enabling year-round production and the growth (in enclosed environments) of specialty crops and other plants not naturally suited to the local climate. Primitive greenhouses were first used during Roman times to produce cucumbers year-round for the Roman emperor Tiberius. The first modern greenhouses were built in Europe in the 16th century to keep exotic plants brought back from explorations abroad. Greenhouses remain an important part of horticulture today, and plastic transparent materials have also been used to similar effect in polytunnels and row covers.
What was one of the first uses of a greenhouse?
Greenhouses convert solar light to heat, enabling year-round production and the growth (in enclosed environments) of specialty crops and other plants not naturally suited to the local climate. Primitive greenhouses were first used during Roman times to produce cucumbers year-round for the Roman emperor Tiberius. The first modern greenhouses were built in Europe in the 16th century to keep exotic plants brought back from explorations abroad. Greenhouses remain an important part of horticulture today, and plastic transparent materials have also been used to similar effect in polytunnels and row covers.
Where were the first modern greenhouses built?
Greenhouses convert solar light to heat, enabling year-round production and the growth (in enclosed environments) of specialty crops and other plants not naturally suited to the local climate. Primitive greenhouses were first used during Roman times to produce cucumbers year-round for the Roman emperor Tiberius. The first modern greenhouses were built in Europe in the 16th century to keep exotic plants brought back from explorations abroad. Greenhouses remain an important part of horticulture today, and plastic transparent materials have also been used to similar effect in polytunnels and row covers.
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
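To give a sense of scale for these settings, the following sketch estimates the number of optimizer steps and the learning-rate schedule. It assumes a single device, the gradient_accumulation_steps of 1 and linear lr_scheduler_type from the full hyperparameter list, and that every one of the 87,599 samples lands in a batch; the `no_duplicates` sampler may change the real step count slightly. `lr_at` is a hypothetical helper.

```python
import math

NUM_SAMPLES = 87_599   # training set size reported in this card
BATCH_SIZE = 16        # per_device_train_batch_size (single device assumed)
EPOCHS = 1             # num_train_epochs
PEAK_LR = 2e-5         # learning_rate
WARMUP_RATIO = 0.1     # warmup_ratio

steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(total_steps, warmup_steps)
```

Under these assumptions the run lasts about 5,475 steps, of which roughly the first 548 are warmup.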
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Framework Versions
- Python: 3.12.7
- Sentence Transformers: 3.2.0
- Transformers: 4.45.2
- PyTorch: 2.2.2+cu121
- Accelerate: 1.0.1
- Datasets: 3.0.1
- Tokenizers: 0.20.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}