SentenceTransformer based on BAAI/bge-base-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
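The architecture above selects CLS pooling (pooling_mode_cls_token: True) followed by Normalize(). The following is a minimal pure-Python sketch of those two steps on made-up 4-dimensional vectors, not the library's implementation. The helper names cls_pool, l2_normalize, and dot are illustrative assumptions.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy token embeddings for two inputs (made-up numbers, 4 dimensions instead of 768).
tokens_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
tokens_b = [[4.0, 3.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors, cosine similarity equals the dot product.
cosine_ab = dot(emb_a, emb_b)
print(emb_a, emb_b, cosine_ab)
```

Only the first token contributes to the pooled vector here, so the second row of each toy input has no effect on the result.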
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("JJTsao/fine-tuned_movie_retriever-bge-base-en-v1.5-nl")
# Run inference
sentences = [
'Movies about the dark side of Hollywood fame and power abuse',
"Title: Frances\nGenres: Drama\nOverview: The true story of Frances Farmer's meteoric rise to fame in Hollywood and the tragic turn her life took when she was blacklisted.\nTagline: Her story is shocking, disturbing, compelling... and true.\nDirector: Graeme Clifford\nStars: Jessica Lange, Sam Shepard, Kim Stanley\nRelease Date: 1982-12-03\nKeywords: strong woman, falsely accused, insanity, movie business, feminism, biography, based on true story, evil mother, psychiatric hospital, female protagonist, hollywood, wrongful imprisonment, lost love, wrongful arrest, wrongful conviction, wrong diagnosis, lobotomy, frances farmer, power abuse, mother daughter relationship",
"Title: Come Drink with Me\nGenres: Action, Adventure\nOverview: Golden Swallow is a fighter-for-hire who has been contracted by the local government to retrieve the governor's kidnapped son. Holding him is a group of rebels who are demanding that their leader be released from prison in return for the captured son. After a brief encounter with the gang at a local restaurant, Golden Swallow is joined by an inebriated wanderer Drunken Cat who aids her in her mission.\nTagline: \nDirector: King Hu\nStars: Cheng Pei-Pei, Elliot Ngok Wah, Chen Hung-Lieh\nRelease Date: 1966-04-07\nKeywords: kung fu, hero, showdown, kidnapping, warrior woman, gore, fistfight, forest, waterfall, murder, tough girl, monastery, heroine, inn, severed hand, wuxia, kung fu master, inner strength, beggar clan, tavern fight",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
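For retrieval, the first sentence in the example plays the role of a query and the remaining entries are candidate movies. Assuming the embeddings are already unit length (the model ends with a Normalize() module), ranking reduces to sorting dot products. The sketch below uses hand-written 2-dimensional vectors in place of real model output, and rank_candidates is a hypothetical helper name.

```python
def rank_candidates(query_embedding, candidate_embeddings, top_k=None):
    """Return (index, score) pairs sorted by descending dot-product score.

    Assumes every embedding is L2-normalized, so the dot product is the cosine similarity.
    """
    scored = [
        (index, sum(q * c for q, c in zip(query_embedding, candidate)))
        for index, candidate in enumerate(candidate_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Made-up unit vectors standing in for model.encode(...) output.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [0.6, 0.8], [0.8, 0.6]]

ranking = rank_candidates(query, candidates, top_k=2)
print(ranking)
```

With real embeddings, the query vector would be the first row returned by model.encode and the candidates the remaining rows.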
Training Details
Training Dataset
Unnamed Dataset
- Size: 32,382 training samples
- Columns: sentence_0, sentence_1, and sentence_2
- Approximate statistics based on the first 1000 samples:
Column | Type | Min tokens | Mean tokens | Max tokens |
---|---|---|---|---|
sentence_0 | string | 8 | 16.52 | 38 |
sentence_1 | string | 37 | 151.92 | 330 |
sentence_2 | string | 48 | 146.76 | 301 |
- Samples:

Sample 1
sentence_0:
Something like a drama story dealing with disturbed teenager or life
sentence_1:
Title: I Never Promised You a Rose Garden
Genres: Drama
Overview: A disturbed and institutionalized 16-year-old girl struggles between fantasy and reality.
Tagline: When she tried to kill herself, it was just the beginning.
Director: Anthony Page
Stars: Kathleen Quinlan, Bibi Andersson, Ben Piazza
Release Date: 1977-07-14
Keywords: disturbed teenager
sentence_2:
Title: Event Horizon
Genres: Horror, Science Fiction, Mystery
Overview: In 2047, a group of astronauts are sent to investigate and salvage the starship Event Horizon which disappeared mysteriously seven years before on its maiden voyage. However, it soon becomes evident that something sinister resides in its corridors.
Tagline: Infinite space. Infinite terror.
Director: Paul W. S. Anderson
Stars: Laurence Fishburne, Sam Neill, Kathleen Quinlan
Release Date: 1997-08-15
Keywords: space marine, nightmare, insanity, delusion, hallucination, space travel, cryogenics, gore, black hole, crew, flashback, evil spirit, alternate dimension, hellgate, religion, explosion, burning man, rescue team, super power, trapped in space, distress signal, 2040s, spaceship

Sample 2
sentence_0:
Stories of brave musketeers fighting against powerful adversaries for justice and love
sentence_1:
Title: The Three Musketeers
Genres: Action, Adventure, Romance, Family
Overview: The young D'Artagnan arrives in Paris with dreams of becoming a King's musketeer. He meets and quarrels with three men, Athos, Porthos, and Aramis, each of whom challenges him to a duel. D'Artagnan finds out they are musketeers and is invited to join them in their efforts to oppose Cardinal Richelieu, who wishes to increase his already considerable power over the King. D'Artagnan must also juggle affairs with the charming Constance Bonancieux and the passionate Lady De Winter, a secret agent for the Cardinal.
Tagline: . . . One for All and All for Fun!
Director: Richard Lester
Stars: Michael York, Oliver Reed, Richard Chamberlain
Release Date: 1973-12-11
Keywords: france, paris, france, based on novel or book, swordplay, fight, satire, dressmaker, louis xiii, sword fight, swordsman, musketeer, extramarital affair, swashbuckler, diamond theft, sword duel, diamond necklace, cardinal, 17th century, queen jewe...
sentence_2:
Title: The Brood
Genres: Horror, Science Fiction
Overview: A man tries to uncover an unconventional psychologist's therapy techniques on his institutionalized wife, while a series of brutal attacks committed by a brood of mutant children coincides with the husband's investigation.
Tagline: The Ultimate Experience in Inner Terror.
Director: David Cronenberg
Stars: Oliver Reed, Samantha Eggar, Art Hindle
Release Date: 1979-05-25
Keywords: toronto, canada, mutant, transformation, psychologist, divorce, psychotherapist, canuxploitation

Sample 3
sentence_0:
Critically acclaimed drama films directed by Sarah Polley exploring the themes of illiteracy and based on novel or book
sentence_1:
Title: Women Talking
Genres: Drama
Overview: A group of women in an isolated religious colony struggle to reconcile their faith with a series of sexual assaults committed by the colony's men.
Tagline: Do nothing. Stay and fight. Leave.
Director: Sarah Polley
Stars: Rooney Mara, Claire Foy, Jessie Buckley
Release Date: 2022-12-23
Keywords: rape, based on novel or book, faith, illiteracy, bolivia, mennonites, religion, gang rape, teenage rape, meeting, duringcreditsstinger, woman director, sexual assault, abusive husband, 2000s, pregnancy from rape
sentence_2:
Title: Alice in Wonderland
Genres: Family, Fantasy, Adventure
Overview: Alice, now 19 years old, returns to the whimsical world she first entered as a child and embarks on a journey to discover her true destiny.
Tagline: You're invited to a very important date.
Director: Tim Burton
Stars: Mia Wasikowska, Johnny Depp, Anne Hathaway
Release Date: 2010-03-03
Keywords: based on novel or book, queen, psychotic, fantasy world, taunting, live action remake, based on young adult novel, mischievous, absurd, dramatic, incredulous, amused, euphoric

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
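To make the loss settings concrete, here is a rough pure-Python sketch of a multiple-negatives ranking objective over (sentence_0, sentence_1, sentence_2) triplets with scale 20.0 and cosine similarity. It assumes that each anchor is scored against every positive and every explicit negative in the batch and that the loss is the cross-entropy of picking its own positive. The function names are illustrative, and this is not the library's code.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i should rank positive i above all other candidates."""
    candidates = positives + negatives  # in-batch positives, then explicit negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in candidates]
        max_logit = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy 2-dimensional embeddings (made-up values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

well_separated = mnr_loss(anchors, positives, negatives)
swapped = mnr_loss(anchors, negatives, positives)  # positives and negatives exchanged
print(well_separated, swapped)
```

The scale factor sharpens the softmax: with cosine scores limited to the range -1 to 1, multiplying by 20 lets a well-matched pair dominate the other candidates.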
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 4
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
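The settings learning_rate: 5e-05, lr_scheduler_type: linear, and warmup_steps: 0 describe a learning rate that decays in a straight line to zero. The sketch below illustrates that rule under stated assumptions: training ran on a single device with gradient_accumulation_steps: 1, so the total step count is derived from 32,382 samples, batch size 32, and 4 epochs. The function name linear_lr is hypothetical.

```python
import math


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total: ceil(32382 / 32) optimizer steps per epoch, times 4 epochs.
total_steps = math.ceil(32382 / 32) * 4

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate starts at its full value, is halved at the midpoint of training, and reaches zero on the final step.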
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.4941 | 500 | 0.796 |
0.9881 | 1000 | 0.517 |
1.4822 | 1500 | 0.3748 |
1.9763 | 2000 | 0.3682 |
2.4704 | 2500 | 0.2839 |
2.9644 | 3000 | 0.2849 |
3.4585 | 3500 | 0.2392 |
3.9526 | 4000 | 0.2373 |
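The fractional epoch values in the table follow from the dataset size and batch size. As a quick check, assuming a single device, no gradient accumulation, and a final partial batch that is kept (dataloader_drop_last: False), 32,382 samples at batch size 32 give 1012 steps per epoch. The helper name epoch_at is illustrative.

```python
import math

NUM_SAMPLES = 32382  # training set size reported in this card
BATCH_SIZE = 32      # per_device_train_batch_size

# The last, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)


def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


for step in range(500, 4001, 500):
    print(step, epoch_at(step))
```

The printed values can be compared directly against the Epoch column of the log.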
Framework Versions
- Python: 3.11.12
- Sentence Transformers: 3.4.1
- Transformers: 4.51.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.6.0
- Datasets: 3.5.1
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for JJTsao/fine-tuned_movie_retriever-bge-base-en-v1.5-nl
Base model
BAAI/bge-base-en-v1.5