SentenceTransformer based on NeuML/pubmedbert-base-embeddings

This is a sentence-transformers model finetuned from NeuML/pubmedbert-base-embeddings on the cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation and gene_description datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30522, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (pooling): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
)
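The Pooling module above is configured with `pooling_mode_mean_tokens: True`, i.e. the sentence embedding is the attention-mask-aware mean of the token embeddings. A minimal sketch of that pooling step in plain PyTorch (tensor shapes and the toy values are illustrative assumptions, not taken from the model):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden) -- hidden would be 768 here.
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding.
    """
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)       # (batch, 1), avoid divide-by-zero
    return summed / counts

# Toy check: two real tokens, one padding token that must not affect the mean
emb = torch.tensor([[[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
print(mean_pool(emb, mask))  # tensor([[2., 4.]])
```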

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jo-mengr/mmcontext-pubmedbert-scvi_fm-v3")
# Run inference
sentences = [
    'MALAT1 RPS27 RPL41 RPL10 RPL13 RPL21 TMSB4X RPL34 RPL13A RPS12 RPLP1 EEF1A1 RPL32 RPS6 RPS14 RPS27A RPS4X RPS29 RPLP2 RPS19 RPL11 RPL23A RPL31 RPS15A RPS3 RPL28 RPL27A RPL18A RPS23 RPL19 RPS28 RPS15 TMSB10 RPL7 RPL30 RPL3 RPS8 RPL35A RPS13 RPL26 RPL15 RPL9 RPL12 RPL10A RPL37 RPS20 RPS16 RPL18 RPS5 RPL36 RPS24 RPL8 RPL6 TPT1 RPL35 FAU RPL29 RPL37A RPSA RPL14 MT-CO3 RPL27 RPS7 RPL38',
    "This measurement was conducted with 10x 3' v2. Sample is a CD8-positive, alpha-beta T cell from a 29-year old Asian female with managed systemic lupus erythematosus (SLE). The cell was isolated from peripheral blood mononuclear cells.",
    "This measurement was conducted with 10x 3' v2. Classical monocyte cell sample from blood of a 64-year old female Asian individual with managed systemic lupus erythematosus (SLE).",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7308, 0.4065],
#         [0.7308, 1.0000, 0.5852],
#         [0.4065, 0.5852, 1.0000]])

Evaluation

Metrics

Triplet

  • Datasets: cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation_cell_sentence_2 and gene_description
  • Evaluated with TripletEvaluator
| Metric | cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation_cell_sentence_2 | gene_description |
|:----------------|:------|:------|
| cosine_accuracy | 0.792 | 0.855 |
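The `cosine_accuracy` reported by `TripletEvaluator` is the fraction of (anchor, positive, negative) triplets for which the anchor is closer, by cosine similarity, to its positive than to its negative. A minimal reimplementation of the metric on synthetic embeddings (this is a sketch of the definition, not the evaluator's actual API):

```python
import numpy as np

def triplet_cosine_accuracy(anchors: np.ndarray,
                            positives: np.ndarray,
                            negatives: np.ndarray) -> float:
    """Fraction of triplets with cos(anchor, positive) > cos(anchor, negative)."""
    def row_cos(a, b):
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return (a * b).sum(axis=1)
    return float(np.mean(row_cos(anchors, positives) > row_cos(anchors, negatives)))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
positives = anchors + 0.1 * rng.normal(size=(4, 8))  # perturbed copies: near the anchors
negatives = rng.normal(size=(4, 8))                  # unrelated vectors
print(triplet_cosine_accuracy(anchors, positives, negatives))
```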

Training Details

Training Datasets

cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation

  • Dataset: cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation at d518eb2
  • Size: 81,143 training samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative_1 | negative_2 |
    |:--|:--|:--|:--|:--|
    | type | string | string | string | string |
    | details | min: 365, mean: 389.52, max: 450 characters | min: 92, mean: 216.13, max: 900 characters | min: 103, mean: 212.72, max: 1186 characters | min: 358, mean: 389.11, max: 433 characters |
  • Samples:
    | anchor | positive | negative_1 | negative_2 |
    |:--|:--|:--|:--|
    | TMSB4X TMSB10 ACTB RPL13A MT-CO3 MALAT1 GNLY RPS15A RPS27 NKG7 IFITM2 RPL12 RPL23A MT-CO2 RPS19 RPS3 RPLP2 RPL28 RPL6 LGALS1 RPL21 RPS6 RPLP1 GZMA EEF1A1 RPL26 RPL37A RPS29 PFN1 RPL34 RPS15 RPS24 RPL11 RPL32 HMGB2 FTH1 RPS23 PTMA MT-CO1 RPL39 RPS20 HSP90AA1 GZMB RPL19 ARHGDIB HNRNPA2B1 PLAAT4 RPS8 RPL37 RPL10 FAU CMC1 RPL41 VIM RPL31 RPL3 MYL12A RPS16 RPL5 CBX3 ATP5F1E HCST RPL27 RPL35 | This measurement was conducted with 10x 3' v2. A proliferating lymphocyte cell sample, obtained from a 34-year-old female Asian individual, derived from peripheral blood mononuclear cells. | This measurement was conducted with 10x 3' v2. Sample is a CD8-positive, alpha-beta T cell derived from a 31-year-old Asian female's peripheral blood mononuclear cells. | MALAT1 RPS27 RPL41 RPL34 RPL21 RPL10 TMSB4X RPL13 RPL13A RPL32 RPS12 RPLP1 RPS29 RPS14 RPS6 EEF1A1 RPS27A RPLP2 RPS19 RPS4X RPS28 RPL39 RPS15A RPL11 RPL27A RPL23A RPS15 RPL18A RPL12 RPL31 RPL26 RPL28 RPL19 RPS8 RPS3 RPL3 RPL36 RPL7 RPL30 RPS23 TMSB10 RPL37 RPL35A RPS13 RPL15 RPL10A MT-CO3 RPS20 RPL18 RPL35 RPL9 RPS16 RPS24 RPS21 RPL37A MT-CO2 RPL29 RPS5 RPL6 RPL8 RPL38 RPL14 MT-CO1 RPL27 |
    | EEF1A1 MALAT1 RPL10 RPS27 RPS12 RPLP1 MT2A RPL41 RPL39 RPL30 MT-ND4L FTH1 RPL13 MT-CO2 RPL32 JUNB RPL28 RPS19 RPL34 TPT1 RPS28 RPS15A RPS27A MT-CYB RPS3 RPS23 RPS4X RPL11 RPS8 RPS14 RPS15 RPL37 RPL5 RPS21 RPS13 FOS RPL19 MT-ND3 RPS29 RPL26 RPL3 RPL18A RPL8 MT-CO1 TMSB10 RPL35A RPL14 RPS6 RPL29 MT-ATP8 RPLP2 RPL36 BTG1 RPL23A RPL18 RPL6 RPSA TMSB4X ZFP36L2 NACA PABPC1 ACTB RPS7 MT-CO3 | This measurement was conducted with 10x 5' v1. Sample is a cell from the omentum tissue, specifically an effector memory CD4-positive, alpha-beta T cell, from a female in her sixth decade. | This measurement was conducted with 10x 5' v1. Sample is a CD4-positive helper T cell, specifically Trm_Th1/Th17 subset, derived from the duodenum tissue of a male individual in his sixth decade. | MALAT1 MT-ATP6 MT-CO2 RPLP1 MT-CO1 TPT1 MT-CO3 RPS27 MT-CYB MT-ND3 RPL41 RPL10 MT-ND4 EEF1A1 VIM JUND TMSB4X RPS12 RPL13 PTMA RPL39 FTH1 RPS27A RPL30 RPS29 RPL32 RPL34 RPS19 RPL28 RPS15A RPL21 RPL37 MT-ND2 CRIP1 ANXA1 RPL11 RPS14 RPS28 RPS6 RPS8 RPS3 EIF1 RPS23 RPS13 RPS24 UBC MT-ND1 RPL19 RPS15 H3-3B RPL26 RPL9 RPS21 RPLP2 RPL35A RPL37A RPL12 RPS4X ACTB RPL3 RPS16 SRGN RPL36 RPL13A |
    | MALAT1 GRIK1 SYT1 PCDH9 RORA NRG1 CADPS ZFPM2 LRRC4C LINGO2 RALYL PTPRD SPHKAP CNTNAP5 SLC8A1 CCSER1 HDAC9 CELF2 R3HDM1 CNTN4 RBMS3 PCDH7 GALNT13 UNC5D ROBO1 SYNPR SNAP25 GPM6A ANK3 FRMPD4 CHRM2 RYR2 KHDRBS2 CADM1 CACNA1D RGS6 PDE4D DOCK4 UNC13C CDH18 FAT3 MEG3 NR2F2-AS1 HMCN1 GULP1 CAMK2D ZEB1 SYN2 DYNC1I1 OXR1 DPP10 OSBPL6 FRAS1 PPP3CA ZNF385D ZMAT4 PCBP3 HS6ST3 ERC2 PLEKHA5 CDK14 MAP2 NCOA1 ATP8A2 | This measurement was conducted with 10x 3' v3. Neuron cell type from a 29-year-old male, specifically from the thalamic complex, specifically the thalamus (THM) - posterior nuclear complex of thalamus (PoN) - medial geniculate nuclei (MG). | This measurement was conducted with 10x 3' v3. Astrocyte cell type from the thalamic complex, specifically from the thalamus (THM) - posterior nuclear complex of thalamus (PoN) - medial geniculate nuclei (MG) region, of a 42-year-old male. | MALAT1 PCDH9 PLP1 MBP ST18 QKI PDE4B RNF220 PTPRD SEPTIN7 TTLL7 NCKAP5 GPM6B PIP4K2A MOBP SLC44A1 PTGDS PLCL1 MAP7 ELMO1 SIK3 FTH1 TMTC2 ZBTB20 MAN2A1 TMEM165 DOCK10 TCF12 EDIL3 ZEB2 DPYD MAP4K4 PHLPP1 TF GAB1 TRIM2 FRMD4B DNAJC6 MARCHF1 ANK3 DST AGAP1 TMEM144 NEAT1 PLEKHH1 DLG1 CRYAB ERBIN RTN4 SPP1 ATP8A1 DOCK4 SLAIN1 APP DOCK5 APBB2 SAMD12 SHTN1 ZNF536 ZFYVE16 ARAP2 LIMCH1 HIPK2 BCAS1 |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
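MultipleNegativesRankingLoss treats, for each anchor in a batch, its paired positive as the correct "class" among all positives in that batch (in-batch negatives), and applies cross-entropy over scaled cosine similarities. A bare-bones PyTorch sketch of that objective, using the `scale=20.0` from the parameters above (shapes and inputs are illustrative; this is the idea, not the library's implementation):

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(anchor_emb: torch.Tensor,
                                    positive_emb: torch.Tensor,
                                    scale: float = 20.0) -> torch.Tensor:
    """Cross-entropy over the scaled cosine-similarity matrix.

    Row i's correct "label" is column i: each anchor should score its own
    positive higher than every other positive in the batch.
    """
    a = F.normalize(anchor_emb, dim=1)
    p = F.normalize(positive_emb, dim=1)
    scores = scale * (a @ p.T)             # (batch, batch) scaled cosine similarities
    labels = torch.arange(scores.size(0))  # the diagonal entries are the targets
    return F.cross_entropy(scores, labels)

anchors = torch.randn(8, 16)
loss = multiple_negatives_ranking_loss(anchors, anchors.clone())
print(loss.item())  # near zero: each anchor matches its own copy perfectly
```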
    

gene_description

  • Dataset: gene_description at dd22363
  • Size: 116,208 training samples
  • Columns: anchor, positive, and negative_1
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative_1 |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 3, mean: 5.88, max: 12 characters | min: 16, mean: 367.09, max: 1375 characters | min: 13, mean: 167.33, max: 1375 characters |
  • Samples:
    | anchor | positive | negative_1 |
    |:--|:--|:--|
    | A1BG | The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. [provided by RefSeq, Jul 2008] | A1BG antisense RNA 1 |
    | A1BG | The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. [provided by RefSeq, Jul 2008] | G antigen 12D |
    | A1BG | The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. [provided by RefSeq, Jul 2008] | G antigen 12B |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Evaluation Datasets

cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation

  • Dataset: cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation at d518eb2
  • Size: 9,011 evaluation samples
  • Columns: anchor, positive, negative_1, and negative_2
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative_1 | negative_2 |
    |:--|:--|:--|:--|:--|
    | type | string | string | string | string |
    | details | min: 363, mean: 390.19, max: 437 characters | min: 99, mean: 209.99, max: 941 characters | min: 101, mean: 208.8, max: 728 characters | min: 356, mean: 390.44, max: 433 characters |
  • Samples:
    | anchor | positive | negative_1 | negative_2 |
    |:--|:--|:--|:--|
    | MT-CO1 MT-CO2 MT-ND3 MT-CO3 MT-ND4 MT-ATP6 MT-CYB MT-ND2 MALAT1 MT-ND1 RPL41 EEF1A1 RPS12 RPLP1 MT-ND4L RPS24 MT-ND5 RPL34 RPS27 RPL10 RPL32 RPL13 RPL11 RPS28 RPL28 RPS15 RPL36 RPL30 RPS27A RPL26 FTH1 TMSB4X ACTB FTL RPL24 RTN4 RPS15A RPS14 RPL3 RPS19 ATP6V0B TPT1 RPS8 FAU RPL8 RPS23 S100A6 RPL18 RPL18A RPL19 RPS13 RPL35 RPS6 RPL37 RPS3 RPL35A NDUFA4 RPS4X RPL21 ATP5F1E COX7C ITM2B RPL29 IGFBP7 | This measurement was conducted with 10x 3' v3. Cell sample from the cortex of kidney, taken from a 43-year-old male of European ethnicity with a reported history of kidney cancer. The cell type is identified as a kidney collecting duct intercalated cell. | This measurement was conducted with 10x 3' v3. Cell sample from the cortex of kidney, taken from a 72-year-old male of European ethnicity, identified as a kidney collecting duct intercalated cell, and preserved through cryopreservation. | MALAT1 MT-CO1 MT-ND3 MT-ATP6 MT-CO2 MT-CO3 MT-ND4 MT-CYB TMSB4X MT-ND2 RPS27 TMSB10 RPL10 MT-ND1 RPL41 ACTB TXNIP RPS27A RPS12 EEF1A1 RPL13 RPS19 RPL32 RPS3 RPL28 RPS15A RPLP1 RPS24 RPL30 RPS29 RPL34 RPL11 RPL26 RPS23 MT-ND5 RPL19 RPS15 RPL18 RPL3 TPT1 RPL37 RPS14 RPS28 PFN1 BTG1 RPS6 FAU RPL9 RPL15 PTMA S100A4 MT-ND4L ATP5F1E RPS8 RPL27A RPS21 RPS7 EIF1 RPL14 RPL12 RPS4X RPL23A FTL RPL18A |
    | MALAT1 KCND2 NRXN1 CDH18 NRXN3 ZNF385D CADM2 RALYL NKAIN2 CADPS2 RIMS1 FSTL5 GRID2 TRPM3 CHN2 DPP6 JMJD1C RORA PDE1A UNC13C TIAM1 NRG1 SNAP25 ZFPM2 CALN1 LSAMP CNTN1 ABLIM1 SYNE1 ANK3 CA10 NFIA ZBTB20 NTM CADM1 OPCML RELN DNM3 MT-CO3 NEBL ERC1 SCN2A PPP3CA CACNA1A GALNT13 LRRC4C GPM6A RABGAP1L RIT2 CAMK4 GRIA4 PTPRD RBFOX3 MCTP1 LHFPL6 PCLO MEG3 PDE10A NOVA1 RTN1 ZNF385B CNTN4 GABRB2 SPOCK1 | This measurement was conducted with 10x 3' v3. Neuron cell type from a 29-year-old male cerebellum, specifically from the Cerebellar Vermis - CBV region, with European self-reported ethnicity, analyzed at the nucleus level. | This measurement was conducted with 10x 3' v3. Sample is an oligodendrocyte precursor cell taken from the cerebellum tissue of a 42-year-old human male, specifically from the Cerebellum (CB) - Cerebellar Vermis - CBV dissection. | MALAT1 NRXN3 SNTG1 UNC5C GRIA4 NRG1 RORA INPP4B CLSTN2 NKAIN2 FRMD4A DPP6 GRID2 NRXN1 LSAMP JMJD1C HS6ST3 NXPH1 MIR99AHG LRRC4C NTM CCNH NFIA ZFPM2 AFF3 OPCML PTPRT CADM2 ZBTB20 OLFM3 SLC22A3 CNTNAP5 CACNA2D3 CNTN4 KCND2 ADARB2 XKR4 GPM6A IL1RAPL1 ALK ANKRD36C UBE2E2 SYN3 GARNL3 PTPRG DAB1 TCF4 LINC00461 PRANCR GRIN2B TNRC6B MAPK10 NOVA1 NFIB ANK3 KCNMA1 KCNQ5 SPON1 TRIM9 VWA8 GDAP1 GABRG2 AHI1 ATP1B1 |
    | EEF1A1 RPL28 RPLP1 RPS8 RPL10 ACTB RPL41 RPS4X GAPDH RPS27 RPS15A RPS23 RPS12 RPS3 RPLP0 RPS7 RPL11 RPL32 RPS24 RPL12 HMGN2 RPS19 RPL34 RPS28 RPL8 PTMA RPS13 RPL19 RPL37 RPL30 RPL6 RPS14 RPL15 SERF2 RPL18A RPLP2 TMSB4X RPS6 CD74 RPL29 RPL13 RPL18 RPS15 RPSA RPL26 PABPC1 RPS27A FTH1 RPL5 TMSB10 RPS21 RPL14 FAU RPL23A PFN1 RPL35A RPS5 RPS16 HMGN1 OAZ1 HMGB1 TPT1 PPIA NACA | This measurement was conducted with 10x 5' v1. Cell sample from the tonsil of a 9-year-old female with recurrent tonsillitis, characterized as a centroblast B cell with IGLC2, IGLV7-43, IGLJ3 immunoglobulin genes expressed. | This measurement was conducted with 10x 5' v1. Germinal center B cell derived from the tonsil tissue of a 3-year-old male with recurrent tonsillitis. | CD74 RPL10 MALAT1 EEF1A1 RPLP1 RPL28 RPL41 RPL13 RPS8 SSR4 TPT1 RPLP0 RPS15A RPL18A UBC RPL37 RPS12 EEF2 RPL19 RPS4X RPL3 RPS27 RPS23 RPL11 RPS28 SAT1 RPS3 RPL34 RPS13 RACK1 RPL29 RPL32 RPS7 RPS19 RPL18 RPL8 RPL30 RPL12 RPS15 RPS14 RPS6 SEC11C RPL15 RPS5 ATP5MG RPL23A RPL35A RPS27A FAU TSC22D3 RPL6 PPIB XBP1 FTL GAPDH RPL5 HLA-DRB5 RPL14 HERPUD1 RGS2 HSPA8 RPL36 RPL26 RPL9 |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

gene_description

  • Dataset: gene_description at dd22363
  • Size: 1,000 evaluation samples
  • Columns: anchor, positive, and negative_1
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative_1 |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 3, mean: 5.88, max: 12 characters | min: 16, mean: 367.09, max: 1375 characters | min: 13, mean: 167.33, max: 1375 characters |
  • Samples:
    | anchor | positive | negative_1 |
    |:--|:--|:--|
    | A1BG | The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. [provided by RefSeq, Jul 2008] | A1BG antisense RNA 1 |
    | A1BG | The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. [provided by RefSeq, Jul 2008] | G antigen 12D |
    | A1BG | The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. [provided by RefSeq, Jul 2008] | G antigen 12B |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • bf16: True
  • gradient_checkpointing: True
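The non-default values above correspond roughly to the following `SentenceTransformerTrainingArguments` configuration (a hedged sketch: `output_dir` is a placeholder, and any argument not listed above is left at its default):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Sketch of the non-default hyperparameters listed above; "output/" is a placeholder.
args = SentenceTransformerTrainingArguments(
    output_dir="output/",
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=2e-5,
    num_train_epochs=4,
    warmup_ratio=0.1,
    bf16=True,
    gradient_checkpointing=True,
)
```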

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch | Step | Training Loss | cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation loss | gene_description loss | cellxgene_pseudo_bulk_100k_multiplets_natural_language_annotation_cell_sentence_2_cosine_accuracy | gene_description_cosine_accuracy
0.0324 50 11.1118 19.5753 5.9302 0.5109 0.1640
0.0649 100 8.5954 18.3310 5.4544 0.5140 0.1800
0.0973 150 9.2422 15.3028 4.9547 0.5157 0.2050
0.1297 200 7.4027 12.1164 4.6149 0.5179 0.3010
0.1621 250 6.3683 8.7628 4.4274 0.5128 0.3680
0.1946 300 4.8876 7.1226 4.3115 0.5173 0.4290
0.2270 350 4.2794 6.1769 4.1100 0.5200 0.5230
0.2594 400 3.9819 5.5841 4.0913 0.5491 0.5470
0.2918 450 3.3978 5.4411 3.9073 0.5835 0.5910
0.3243 500 3.382 5.3812 3.7190 0.6073 0.6380
0.3567 550 3.3258 5.1994 3.6217 0.6317 0.6570
0.3891 600 3.1445 5.0669 3.5130 0.6525 0.6790
0.4215 650 2.821 5.1486 3.4302 0.6592 0.6960
0.4540 700 3.1259 5.1082 3.3893 0.6745 0.6940
0.4864 750 2.5501 5.0555 3.3233 0.6844 0.6980
0.5188 800 2.7482 4.8770 3.2845 0.6963 0.7350
0.5512 850 3.0687 4.8827 3.2678 0.7028 0.7250
0.5837 900 2.7547 4.7859 3.2468 0.7092 0.7140
0.6161 950 2.5732 4.7323 3.2219 0.7160 0.7300
0.6485 1000 2.5944 4.8185 3.1714 0.7169 0.7540
0.6809 1050 2.5687 4.6262 3.1597 0.7253 0.7360
0.7134 1100 2.8425 4.6943 3.1093 0.7343 0.7560
0.7458 1150 2.3715 4.6413 3.1107 0.7327 0.7480
0.7782 1200 2.6028 4.5452 3.1065 0.7397 0.7490
0.8106 1250 2.6916 4.5529 3.0629 0.7426 0.7710
0.8431 1300 2.5536 4.5393 3.0937 0.7442 0.7730
0.8755 1350 2.3964 4.5170 3.0533 0.7474 0.7750
0.9079 1400 2.5294 4.4737 3.0284 0.7514 0.7810
0.9403 1450 2.3428 4.5252 3.0048 0.7523 0.7860
0.9728 1500 2.2832 4.4570 3.0046 0.7546 0.7880
1.0052 1550 2.4838 4.4645 2.9743 0.7560 0.7870
1.0376 1600 2.2069 4.4958 2.9889 0.7565 0.7900
1.0700 1650 2.1644 4.4804 2.9450 0.7577 0.8090
1.1025 1700 2.2339 4.4097 2.9550 0.7584 0.7950
1.1349 1750 2.3097 4.4550 2.9476 0.7589 0.7940
1.1673 1800 2.0396 4.4098 2.9459 0.7584 0.7960
1.1997 1850 2.2754 4.3819 2.9214 0.7630 0.8090
1.2322 1900 2.2027 4.4073 2.8998 0.7635 0.8160
1.2646 1950 2.233 4.3522 2.9309 0.7637 0.7990
1.2970 2000 2.1282 4.3914 2.9119 0.7657 0.8030
1.3294 2050 2.2827 4.4009 2.9088 0.7621 0.8100
1.3619 2100 2.1032 4.3749 2.9004 0.7674 0.8090
1.3943 2150 2.1256 4.3401 2.8910 0.7680 0.8090
1.4267 2200 2.2666 4.3397 2.8794 0.7701 0.8150
1.4591 2250 2.3281 4.3065 2.8766 0.7704 0.8150
1.4916 2300 2.1385 4.3089 2.8434 0.7718 0.8220
1.5240 2350 2.2675 4.3343 2.8382 0.7732 0.8210
1.5564 2400 2.2412 4.2702 2.8591 0.7741 0.8270
1.5888 2450 2.0092 4.2566 2.8626 0.7739 0.8250
1.6213 2500 2.2628 4.2259 2.8382 0.7771 0.8320
1.6537 2550 2.3358 4.2075 2.8568 0.7775 0.8250
1.6861 2600 2.139 4.2953 2.8455 0.7786 0.8290
1.7185 2650 2.2749 4.2156 2.8392 0.7807 0.8330
1.7510 2700 2.1997 4.2526 2.8198 0.7792 0.8370
1.7834 2750 2.1923 4.3028 2.8413 0.7809 0.8310
1.8158 2800 2.384 4.1491 2.8303 0.7823 0.8370
1.8482 2850 2.211 4.2045 2.8420 0.7840 0.8300
1.8807 2900 2.1251 4.1696 2.8533 0.7868 0.8290
1.9131 2950 2.1539 4.1611 2.8321 0.7842 0.8380
1.9455 3000 2.1108 4.1235 2.8206 0.7870 0.8420
1.9780 3050 2.2329 4.1143 2.8159 0.7873 0.8370
2.0104 3100 2.107 4.1063 2.8296 0.7856 0.8510
2.0428 3150 2.0815 4.0980 2.8250 0.7880 0.8510
2.0752 3200 2.1147 4.1009 2.8179 0.7862 0.8520
2.1077 3250 2.1254 4.0894 2.8121 0.7877 0.8540
2.1401 3300 2.2891 4.1078 2.8076 0.7857 0.8540
2.1725 3350 1.9332 4.1062 2.8099 0.7877 0.8520
2.2049 3400 2.0915 4.0826 2.8105 0.7884 0.8500
2.2374 3450 2.1009 4.0940 2.8079 0.7884 0.8490
2.2698 3500 1.9798 4.0965 2.8021 0.7885 0.8490
2.3022 3550 1.9953 4.0991 2.8020 0.7871 0.8500
2.3346 3600 2.0243 4.0925 2.8069 0.7881 0.8490
2.3671 3650 1.9352 4.0702 2.8065 0.7878 0.8470
2.3995 3700 2.0431 4.0910 2.8070 0.7877 0.8510
2.4319 3750 2.1696 4.0813 2.7993 0.7898 0.8530
2.4643 3800 1.9443 4.0904 2.8072 0.7873 0.8480
2.4968 3850 2.2002 4.0618 2.8043 0.7886 0.8490
2.5292 3900 2.1554 4.0779 2.8028 0.7894 0.8510
2.5616 3950 2.0185 4.0936 2.8081 0.7896 0.8510
2.5940 4000 1.9604 4.0973 2.8034 0.7895 0.8530
2.6265 4050 2.1299 4.0703 2.7996 0.7903 0.8530
2.6589 4100 1.9768 4.0632 2.7984 0.7885 0.8550
2.6913 4150 2.1236 4.0532 2.7967 0.7894 0.8490
2.7237 4200 2.1007 4.0455 2.7914 0.7885 0.8530
2.7562 4250 2.0482 4.0679 2.7918 0.7904 0.8470
2.7886 4300 1.9541 4.0671 2.7904 0.7906 0.8490
2.8210 4350 2.0531 4.0699 2.7902 0.7900 0.8500
2.8534 4400 1.9997 4.0799 2.7870 0.7885 0.8500
2.8859 4450 1.9374 4.0731 2.7884 0.7886 0.8480
2.9183 4500 2.0898 4.0449 2.7937 0.7895 0.8510
2.9507 4550 2.0351 4.0502 2.8023 0.7901 0.8470
2.9831 4600 1.9308 4.0406 2.7928 0.7896 0.8510
3.0156 4650 2.3701 4.0345 2.7892 0.7910 0.8510
3.0480 4700 1.9955 4.0689 2.7872 0.7888 0.8510
3.0804 4750 1.9005 4.0190 2.7872 0.7925 0.8510
3.1128 4800 2.1007 4.0551 2.7921 0.7897 0.8500
3.1453 4850 1.9132 4.0367 2.7896 0.7916 0.8500
3.1777 4900 1.9924 4.0449 2.7923 0.7905 0.8490
3.2101 4950 2.146 4.0392 2.7914 0.7901 0.8510
3.2425 5000 2.0803 4.0458 2.7926 0.7913 0.8470
3.2750 5050 2.0173 4.0445 2.7913 0.7908 0.8490
3.3074 5100 2.0224 4.0406 2.7910 0.7933 0.8480
3.3398 5150 1.9332 4.0337 2.7837 0.7917 0.8530
3.3722 5200 2.0368 4.0272 2.7780 0.7905 0.8530
3.4047 5250 2.1804 4.0399 2.7819 0.7926 0.8500
3.4371 5300 2.0873 4.0408 2.7759 0.7926 0.8580
3.4695 5350 2.1205 4.0551 2.7746 0.7894 0.8560
3.5019 5400 1.945 4.0467 2.7791 0.7917 0.8540
3.5344 5450 2.1594 4.0339 2.7767 0.7929 0.8540
3.5668 5500 2.2175 4.0215 2.7804 0.7917 0.8520
3.5992 5550 1.9389 4.0251 2.7794 0.7916 0.8480
3.6316 5600 1.8196 4.0301 2.7805 0.7917 0.8500
3.6641 5650 1.8026 4.0289 2.7784 0.7908 0.8500
3.6965 5700 1.9885 4.0219 2.7775 0.7917 0.8530
3.7289 5750 2.137 4.0052 2.7749 0.7926 0.8550
3.7613 5800 2.14 4.0050 2.7785 0.7917 0.8550
3.7938 5850 2.1486 4.0081 2.7785 0.7923 0.8530
3.8262 5900 2.0139 4.0139 2.7787 0.7916 0.8540
3.8586 5950 2.1015 4.0230 2.7789 0.7925 0.8520
3.8911 6000 1.791 4.0231 2.7764 0.7925 0.8540
3.9235 6050 1.9892 4.0208 2.7763 0.7924 0.8540
3.9559 6100 2.0315 4.0217 2.7762 0.7923 0.8560
3.9883 6150 2.0294 4.0220 2.7764 0.7920 0.8550

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 5.0.0
  • Transformers: 4.55.0.dev0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.9.0
  • Datasets: 2.19.1
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
