This is a sentence-transformers model fine-tuned from jxm/cde-small-v2. It maps sentences and paragraphs to a dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({}) with Transformer model: ContextualDocumentEmbeddingTransformer
)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("BlackBeenie/cde-small-v2-biencoder-msmarco")
# Run inference
sentences = [
    'when did jeepers creepers come out',
    'Jeepers Creepers Wiki. Creeper. Creeper is a fictional character and the main antagonist in the 2001 horror film Jeepers Creepers and its 2003 sequel Jeepers Creepers II. It is an ancient, mysterious demon who viciously feeds on the flesh and bones of many human beings for 23 days every 23rd spring.',
    ' Creep is a song by the English alternative rock band Radiohead. Radiohead released Creep as their debut single in 1992, and it later appeared on their first album, Pablo Honey (1993). During its initial release, Creep was not a chart success.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
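The similarity matrix from the usage code is what drives semantic search: row 0 holds the query's score against every sentence, and sorting that row ranks the passages. A minimal sketch of that ranking step follows. It uses hand-made toy vectors in place of real model output so that it runs without downloading the model; the vector values and the helper names `cosine` and `rank_passages` are illustrative assumptions, not part of the sentence-transformers API.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy 3-dimensional stand-ins for embeddings (hypothetical values).
query = [1.0, 0.2, 0.0]
passages = [
    [0.9, 0.3, 0.1],  # points in nearly the same direction as the query
    [0.0, 0.1, 1.0],  # nearly orthogonal to the query
]
ranking = rank_passages(query, passages)
print(ranking)
```

With real embeddings, `query` would be the first row of `embeddings` and `passages` the remaining rows.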
Columns: sentence_0, sentence_1, and sentence_2
field | sentence_0 | sentence_1 | sentence_2 |
---|---|---|---|
type | string | string | string |
details | | | |
sentence_0 | sentence_1 | sentence_2 |
---|---|---|
what year did the sandy hook incident happen | For Newtown, 2012 Sandy Hook Elementary School shooting is still painful. It's been three years since the terrible day Jimmy Greene's 6-year-old daughter, Ana Grace Marquez, and 19 other children were murdered in the mass shooting at Sandy Hook Elementary School. But life without Ana, who loved to sing and dance from room to room, continues to be so hard that, in some ways, Dec. 14 is no tougher than any other day for Greene. | Hook is a 1991 Steven Spielberg film starring Dustin Hoffman and Robin Williams. The film's storyline is based on the books written by Sir James Matthew Barrie in 1904 or 1905 and is the sequel to the first book. |
what kind of degree do you need to be a medical assistant? | If you choose this path, here is what you need to do: 1 Have a high school diploma or GED. The minimum educational requirement for medical assistants is a high school diploma or equivalency degree. 2 Find a doctor who will provide training. | Many colleges offer two-year associate's degrees or one-year certificate programs in different areas of medical office technology. Certificate areas include billing specialist, medical administrative assistant, and medical transcriptionist. Because of the complexity of medical jargon and operational procedures, many employers prefer these professionals to hold related two-year degrees or complete one-year training programs. |
what does usb cord do | The Flash Player is required to see this video. The term USB stands for Universal Serial Bus. USB cable assemblies are some of the most popular cable types available, used mostly to connect computers to peripheral devices such as cameras, camcorders, printers, scanners, and more. Devices manufactured to the current USB Revision 3.0 specification are backward compatible with version 1.1. | The USB 2.0 specification for a Full-Speed/High-Speed cable calls for four wires, two for data and two for power, and a braided outer shield. The USB 3.0 specification calls for a total of 10 wires plus a braided outer shield. Two wires are used for power. |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
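To make the two parameters concrete: the loss treats each query's own passage as the correct class among all passages in the batch, scores every query-passage pair by cosine similarity (`cos_sim`), multiplies the scores by `scale` (20.0), and applies cross-entropy. The block below is a pure-Python sketch of that idea on toy 2-dimensional vectors, not the library's implementation; the function name `mnr_loss` and the vector values are assumptions made for illustration.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, candidates, scale=20.0):
    """Mean cross-entropy where candidates[i] is the correct match for anchors[i].

    Every other candidate (other rows' positives, plus any extra negatives
    appended after them) acts as a negative for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive lies near its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each positive lies near the other anchor
loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch yields a loss near zero and the swapped batch a large one. A larger `scale` sharpens the softmax, so small differences in cosine similarity produce larger differences in loss.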
Non-default hyperparameters:
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
fp16: True
multi_dataset_batch_sampler: round_robin

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 3
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin

Epoch | Step | Training Loss |
---|---|---|
0.0321 | 500 | 0.9856 |
0.0641 | 1000 | 0.4499 |
0.0962 | 1500 | 0.3673 |
0.1282 | 2000 | 0.339 |
0.1603 | 2500 | 0.3118 |
0.1923 | 3000 | 0.2929 |
0.2244 | 3500 | 0.2886 |
0.2564 | 4000 | 0.2771 |
0.2885 | 4500 | 0.2762 |
0.3205 | 5000 | 0.2716 |
0.3526 | 5500 | 0.2585 |
0.3846 | 6000 | 0.2631 |
0.4167 | 6500 | 0.2458 |
0.4487 | 7000 | 0.2496 |
0.4808 | 7500 | 0.252 |
0.5128 | 8000 | 0.2399 |
0.5449 | 8500 | 0.2422 |
0.5769 | 9000 | 0.2461 |
0.6090 | 9500 | 0.2314 |
0.6410 | 10000 | 0.2331 |
0.6731 | 10500 | 0.2314 |
0.7051 | 11000 | 0.2302 |
0.7372 | 11500 | 0.235 |
0.7692 | 12000 | 0.2176 |
0.8013 | 12500 | 0.2201 |
0.8333 | 13000 | 0.2206 |
0.8654 | 13500 | 0.222 |
0.8974 | 14000 | 0.2136 |
0.9295 | 14500 | 0.2108 |
0.9615 | 15000 | 0.2102 |
0.9936 | 15500 | 0.2098 |
1.0256 | 16000 | 0.1209 |
1.0577 | 16500 | 0.099 |
1.0897 | 17000 | 0.0944 |
1.1218 | 17500 | 0.0955 |
1.1538 | 18000 | 0.0947 |
1.1859 | 18500 | 0.0953 |
1.2179 | 19000 | 0.0943 |
1.25 | 19500 | 0.0911 |
1.2821 | 20000 | 0.0964 |
1.3141 | 20500 | 0.0933 |
1.3462 | 21000 | 0.0956 |
1.3782 | 21500 | 0.0941 |
1.4103 | 22000 | 0.0903 |
1.4423 | 22500 | 0.0889 |
1.4744 | 23000 | 0.0919 |
1.5064 | 23500 | 0.0917 |
1.5385 | 24000 | 0.0956 |
1.5705 | 24500 | 0.0903 |
1.6026 | 25000 | 0.0931 |
1.6346 | 25500 | 0.0931 |
1.6667 | 26000 | 0.089 |
1.6987 | 26500 | 0.0892 |
1.7308 | 27000 | 0.091 |
1.7628 | 27500 | 0.0892 |
1.7949 | 28000 | 0.0884 |
1.8269 | 28500 | 0.0889 |
1.8590 | 29000 | 0.0877 |
1.8910 | 29500 | 0.0866 |
1.9231 | 30000 | 0.0853 |
1.9551 | 30500 | 0.085 |
1.9872 | 31000 | 0.0867 |
2.0192 | 31500 | 0.055 |
2.0513 | 32000 | 0.0338 |
2.0833 | 32500 | 0.033 |
2.1154 | 33000 | 0.033 |
2.1474 | 33500 | 0.0317 |
2.1795 | 34000 | 0.0323 |
2.2115 | 34500 | 0.0322 |
2.2436 | 35000 | 0.0316 |
2.2756 | 35500 | 0.0314 |
2.3077 | 36000 | 0.0312 |
2.3397 | 36500 | 0.0324 |
2.3718 | 37000 | 0.0324 |
2.4038 | 37500 | 0.0328 |
2.4359 | 38000 | 0.0311 |
2.4679 | 38500 | 0.0312 |
2.5 | 39000 | 0.0312 |
2.5321 | 39500 | 0.0311 |
2.5641 | 40000 | 0.0315 |
2.5962 | 40500 | 0.0308 |
2.6282 | 41000 | 0.0308 |
2.6603 | 41500 | 0.0306 |
2.6923 | 42000 | 0.0313 |
2.7244 | 42500 | 0.0322 |
2.7564 | 43000 | 0.0315 |
2.7885 | 43500 | 0.0311 |
2.8205 | 44000 | 0.0321 |
2.8526 | 44500 | 0.0318 |
2.8846 | 45000 | 0.0305 |
2.9167 | 45500 | 0.031 |
2.9487 | 46000 | 0.032 |
2.9808 | 46500 | 0.0306 |
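The Epoch and Step columns of the log together imply the length of one epoch, which in turn gives a rough size for the training set. The arithmetic below is a sketch under one assumption the card does not state: training ran on a single device, so each step consumed one batch of 32 rows (gradient_accumulation_steps is listed as 1). The row count is an upper bound, since the final batch of an epoch may be partial.

```python
# Values read from the training log and the hyperparameter list.
step, epoch_fraction = 39000, 2.5
per_device_train_batch_size = 32
num_train_epochs = 3

steps_per_epoch = round(step / epoch_fraction)

# Assumes one device, so each optimizer step consumes one batch of 32 rows.
approx_rows_per_epoch = steps_per_epoch * per_device_train_batch_size
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, approx_rows_per_epoch, total_steps)
# 15600 499200 46800
```

This agrees with the last logged row: step 46500 divided by 15600 steps per epoch is about 2.9808.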
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}