# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
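
As a quick sanity check, here is a minimal sketch (with made-up input sentences) of the dimensionality and similarity function listed above; because the model L2-normalizes its outputs, cosine similarity reduces to a plain dot product:

```python
# Minimal sketch: check output dimensionality and the cosine-similarity setup.
# The two input sentences are made up for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jaimevera1107/all-MiniLM-L12-v2-pubmed")
emb = model.encode(["Lithium kinetics in rat serum.", "Thyroid phosphodiesterase activity."])
print(emb.shape)  # (2, 384): one 384-dimensional vector per sentence

# The final Normalize() module makes each embedding unit-length,
# so cosine similarity is just a dot product.
print(np.allclose(np.linalg.norm(emb, axis=1), 1.0))  # True
print(float(emb[0] @ emb[1]))  # cosine similarity of the pair
```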
### Model Sources
- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
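
For reference, the three stages above can be reproduced with the plain transformers API: mean pooling averages the token embeddings under the attention mask, and normalization scales the result to unit length. This is a sketch with a made-up input sentence; loading via SentenceTransformer (shown in Usage below) is the simpler path:

```python
# Sketch: reproduce the Transformer -> mean Pooling -> Normalize pipeline manually.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

repo = "jaimevera1107/all-MiniLM-L12-v2-pubmed"
tokenizer = AutoTokenizer.from_pretrained(repo)
bert = AutoModel.from_pretrained(repo)

batch = tokenizer(["An example sentence."], padding=True, truncation=True,
                  max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**batch).last_hidden_state  # (batch, seq_len, 384)

# (1) Mean pooling: average token vectors, ignoring padding via the attention mask.
mask = batch["attention_mask"].unsqueeze(-1).float()
mean_pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2) Normalize: unit length, so dot products equal cosine similarities.
sentence_embedding = F.normalize(mean_pooled, p=2, dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 384])
```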
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jaimevera1107/all-MiniLM-L12-v2-pubmed")
# Run inference
sentences = [
    "What were the findings of the study on cyclic 3',5'-nucleotide phosphodiesterase in bovine thyroid regarding its activity and the factors influencing it?",
    "The study investigated the properties of cyclic 3',5'-nucleotide phosphodiesterase in bovine thyroid, revealing that its activity is stimulated by Mg2+ and requires a Ca2+-dependent activating factor, with distinct enzyme forms and kinetic behaviors observed.",
    '[The kinetics of lithium in the rat serum, brain and liver]. The kinetics of lithium in the serum, liver and brain of rats is described. The serum levels resembled those of man, whereas considerable quantitative differences were observed when comparing specific kinetic parameters. The brain level increased with the increasing doses, approaching the corresponding serum level. Concentration differences between different brain areas could be observed only after repeated administrations. Striatum, cortex and hippocampus showed significantly higher levels than the thalamus. The liver content remained low with increasing doses, and was below the brain level.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
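
The same two calls also support a simple semantic-search loop. A sketch with a made-up corpus and query:

```python
# Sketch: rank a small corpus against a query (corpus and query are made up).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jaimevera1107/all-MiniLM-L12-v2-pubmed")

corpus = [
    "Lithium levels in rat brain approach serum levels as the dose increases.",
    "Indoramin reduced migraine frequency in most of the asthmatic patients studied.",
    "Renal inflammation was found in half of the skunks examined.",
]
query = "How does lithium distribute between serum and brain?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode(query)

# model.similarity applies the configured similarity function (cosine).
scores = model.similarity(query_embedding, corpus_embeddings)  # shape: (1, 3)
best = scores.argmax().item()
print(corpus[best], float(scores[0, best]))
```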
## Training Details

### Training Dataset

#### Unnamed Dataset
- Size: 67,560 training samples
- Columns: `sentence_0`, `sentence_1`, and `label`
- Approximate statistics based on the first 1000 samples:

| | `sentence_0` | `sentence_1` | `label` |
|:--|:--|:--|:--|
| type | string | string | float |
| details | min: 7 tokens, mean: 74.75 tokens, max: 256 tokens | min: 6 tokens, mean: 59.33 tokens, max: 256 tokens | min: 0.0, mean: 0.68, max: 1.0 |
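
Statistics like these can be recomputed with the model's tokenizer. In the sketch below, `samples` is a placeholder list of rows in the card's format; the real 67,560-row dataset is not shipped with this card:

```python
# Sketch: recompute token-count and label statistics for dataset rows.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jaimevera1107/all-MiniLM-L12-v2-pubmed")
tokenizer = model.tokenizer

samples = [  # placeholder rows standing in for the actual (unpublished) dataset
    {"sentence_0": "What was studied?", "sentence_1": "The study examined X.", "label": 0.5},
    {"sentence_0": "What changed?", "sentence_1": "Levels increased with dose.", "label": 1.0},
]

def token_stats(texts):
    # Token counts as the model sees them, capped at the 256-token maximum.
    lengths = [len(tokenizer.encode(t, truncation=True, max_length=256)) for t in texts]
    return min(lengths), sum(lengths) / len(lengths), max(lengths)

subset = samples[:1000]
for column in ("sentence_0", "sentence_1"):
    print(column, token_stats([row[column] for row in subset]))
labels = [row["label"] for row in subset]
print("label", min(labels), sum(labels) / len(labels), max(labels))
```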
- Samples:

| `sentence_0` | `sentence_1` | `label` |
|:--|:--|:--|
| What were the outcomes for asthmatic patients in a trial of indoramin regarding airflow improvement and migraine frequency reduction? | Long-term trial of an alpha adrenoceptor blocking drug (Indoramin) in asthma. A preliminary report. Eight patients suffering from both asthma and migraine underwent a clinical trial for 3 months of indoramin, an alpha adrenoceptor antagonist with antihistamine and antiserotonin activity. Patients were told indoramin was prescribed for migraine prophylaxis. In three asthmatic patients there was a marked increase in airflow meter (AFM) readings which were recorded daily, the remaining five showing no significant change or a decrease in AFM readings. Indoramin did not appear to potentiate the action of the beta sympathomimetic aerosols. It is suggested that a small population of asthmatic patients may derive therapeutic benefit from an alpha adrenoceptor antagonist. Seven of the eight patients experienced a 50% decrease in the frequency of their migraine headaches. | 0.5 |
| The ontogeny of L-alpha-hydroxyacid oxidase isozymes in the mouse. Mouse liver hydroxyacid oxidase isozymes are present at low levels at birth and increase in activity until day 13, after which HAOX-B almost disappears and HAOX-A is reduced to approximately one half the maximum level in the adult kidney. HAOX-G appears near day 13 post partum and increases until adult levels are reached, the female having four times the activity of the male. The pregnant female has significantly lower levels of HAOX-A and HAOX-B in the liver and higher activity of HAOX-B in the kidney. Developmental changes occur in the extent of epigenetic modification of mouse liver HAOX-A during the early neonatal period. | In mice, liver hydroxyacid oxidase isozymes show developmental changes, with HAOX-B decreasing after day 13 and HAOX-G increasing, while pregnant females exhibit lower HAOX-A and HAOX-B levels in the liver but higher HAOX-B activity in the kidney. | 1.0 |
| What were the findings regarding renal inflammation and leptospires in a study of striped skunks from Louisiana? | In a study of 100 striped skunks from Louisiana, 50% exhibited renal inflammation, and 10% with severe lesions showed azotemia, while leptospires were cultured from 30% of the skunks. | 0.5 |
- Loss: CosineSimilarityLoss with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```
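
In other words, the cosine similarity of each sentence pair's embeddings is regressed onto the float label with mean squared error. A rough sketch of that objective (illustrative only; real training should use `sentence_transformers.losses.CosineSimilarityLoss`):

```python
# Sketch of the objective behind CosineSimilarityLoss with an MSE loss_fct.
import torch
import torch.nn.functional as F

def cosine_similarity_loss(emb_a: torch.Tensor, emb_b: torch.Tensor,
                           labels: torch.Tensor) -> torch.Tensor:
    predicted = F.cosine_similarity(emb_a, emb_b, dim=1)  # predicted score in [-1, 1]
    return F.mse_loss(predicted, labels)                  # loss_fct = torch.nn.MSELoss

# Toy check with random "embeddings" and labels in this dataset's 0.0-1.0 range.
a, b = torch.randn(4, 384), torch.randn(4, 384)
labels = torch.tensor([0.5, 1.0, 0.5, 0.0])
print(cosine_similarity_loss(a, b, labels))
```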
### Training Hyperparameters

#### Non-Default Hyperparameters
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 4
- fp16: True
- multi_dataset_batch_sampler: round_robin
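
A minimal sketch of a comparable training run with these values. Here `train_dataset` is a one-row placeholder (the actual 67,560-sample dataset is not published), the output directory name is made up, and fp16=True assumes a CUDA device:

```python
# Sketch: a fine-tuning run with this card's non-default hyperparameters.
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

train_dataset = Dataset.from_dict({  # placeholder row in the card's column format
    "sentence_0": ["What was studied?"],
    "sentence_1": ["The study examined lithium kinetics."],
    "label": [0.5],
})

args = SentenceTransformerTrainingArguments(
    output_dir="all-MiniLM-pubmed-finetune",  # hypothetical directory name
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=4,
    fp16=True,  # requires a CUDA device
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=CosineSimilarityLoss(model),
)
trainer.train()
```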
#### All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
### Training Logs

| Epoch | Step | Training Loss |
|:------|:-----|:--------------|
| 0.2367 | 500 | 0.0659 |
| 0.4735 | 1000 | 0.042 |
| 0.7102 | 1500 | 0.0351 |
| 0.9470 | 2000 | 0.0328 |
| 1.1837 | 2500 | 0.0291 |
| 1.4205 | 3000 | 0.0269 |
| 1.6572 | 3500 | 0.0265 |
| 1.8939 | 4000 | 0.026 |
| 2.1307 | 4500 | 0.0245 |
| 2.3674 | 5000 | 0.0231 |
| 2.6042 | 5500 | 0.0219 |
| 2.8409 | 6000 | 0.0229 |
| 3.0777 | 6500 | 0.0227 |
| 3.3144 | 7000 | 0.0206 |
| 3.5511 | 7500 | 0.02 |
| 3.7879 | 8000 | 0.0201 |
### Framework Versions
- Python: 3.11.9
- Sentence Transformers: 4.1.0
- Transformers: 4.52.3
- PyTorch: 2.7.0+cu118
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
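
To approximate this environment, a pin set along these lines should work (untested; the +cu118 PyTorch build needs the matching wheel index):

```bash
# Untested reproduction pins based on the versions listed above.
pip install torch==2.7.0 --index-url https://download.pytorch.org/whl/cu118
pip install sentence-transformers==4.1.0 transformers==4.52.3 accelerate==1.7.0 datasets==3.6.0 tokenizers==0.21.1
```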
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```