SentenceTransformer based on Shuu12121/CodeModernBERT-Owl-2.2-Pre
This is a sentence-transformers model finetuned from Shuu12121/CodeModernBERT-Owl-2.2-Pre. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Shuu12121/CodeModernBERT-Owl-2.2-Pre
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
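The similarity function listed above is cosine similarity, which compares the direction of two embedding vectors and ignores their length. A minimal sketch in plain Python follows; `cosine_similarity` is a hypothetical helper written for illustration, not part of the Sentence Transformers API, and the short vectors stand in for the model's 768-dimensional outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 2-dimensional vectors: same direction, orthogonal, and opposite.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))
```

Scores range from -1 to 1, so a vector scaled by any positive factor still scores 1 against the original.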
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
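The Pooling module above has pooling_mode_cls_token set to True and pooling_mode_mean_tokens set to False, so the sentence embedding is the output vector of the first token rather than an average over all tokens. The toy sketch below contrasts the two modes in plain Python. It is not the library's implementation: `cls_pool` and `mean_pool` are hypothetical helper names, and nested lists stand in for tensors of shape (batch, tokens, dimension).

```python
def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector of each sequence."""
    return [list(seq[0]) for seq in token_embeddings]


def mean_pool(token_embeddings, attention_mask):
    """Mean pooling: average the vectors of unmasked tokens (shown for contrast).

    Assumes every sequence has at least one unmasked token.
    """
    pooled = []
    for seq, mask in zip(token_embeddings, attention_mask):
        kept = [vec for vec, m in zip(seq, mask) if m]
        dim = len(seq[0])
        pooled.append([sum(vec[d] for vec in kept) / len(kept) for d in range(dim)])
    return pooled


# One sequence with three tokens of dimension 2; the last token is padding.
tokens = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
mask = [[1, 1, 0]]
print(cls_pool(tokens))         # [[1.0, 2.0]]
print(mean_pool(tokens, mask))  # [[2.0, 3.0]]
```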
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Shuu12121/CodeSearch-ModernBERT-Owl-2.2")
# Run inference
sentences = [
'<pre>\nField extraction metadata on the property.\n</pre>\n\n<code>.google.cloud.documentai.v1beta3.FieldExtractionMetadata field_extraction_metadata = 9;\n</code>\n\n@return Whether the fieldExtractionMetadata field is set.',
'@java.lang.Override\n public boolean hasFieldExtractionMetadata() {\n return ((bitField0_ & 0x00000001) != 0);\n }',
'pub fn poller(self) -> impl lro::Poller<(), crate::model::DeleteSitemapMetadata> {\n type Operation =\n lro::internal::Operation<wkt::Empty, crate::model::DeleteSitemapMetadata>;\n let polling_error_policy = self.0.stub.get_polling_error_policy(&self.0.options);\n let polling_backoff_policy = self.0.stub.get_polling_backoff_policy(&self.0.options);\n\n let stub = self.0.stub.clone();\n let mut options = self.0.options.clone();\n options.set_retry_policy(gax::retry_policy::NeverRetry);\n let query = move |name| {\n let stub = stub.clone();\n let options = options.clone();\n async {\n let op = GetOperation::new(stub)\n .set_name(name)\n .with_options(options)\n .send()\n .await?;\n Ok(Operation::new(op))\n }\n };\n\n let start = move || async {\n let op = self.send().await?;\n Ok(Operation::new(op))\n };\n\n lro::internal::new_unit_response_poller(\n polling_error_policy,\n polling_backoff_policy,\n start,\n query,\n )\n }',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
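For code search, one row of the similarity matrix (a docstring query scored against several code candidates) is typically turned into a ranking. The sketch below shows that last step on plain Python floats. `top_k_indices` is a hypothetical helper, and the scores are made-up values for illustration, not outputs of this model.

```python
def top_k_indices(scores, k):
    """Return the indices of the k highest scores, best first."""
    if k < 0:
        raise ValueError("k must be non-negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Made-up similarity scores between one query and three code candidates.
# With a loaded model, a row like this could come from something such as
# model.similarity(query_embedding, code_embeddings)[0].tolist().
example_scores = [0.12, 0.87, 0.45]
print(top_k_indices(example_scores, k=2))  # [1, 2]
```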
Training Details
Training Dataset
Unnamed Dataset
- Size: 2,732,400 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 8 tokens, mean: 70.49 tokens, max: 1024 tokens | min: 5 tokens, mean: 138.25 tokens, max: 1024 tokens | min: 1.0, mean: 1.0, max: 1.0 |
- Samples:
  - sentence_0: Prints the specified pkg. If is_main is not set, nested package notation is used.
    sentence_1:
        pub fn print_package(
            &mut self,
            resolve: &Resolve,
            pkg: PackageId,
            is_main: bool,
        ) -> Result<()> {
            let pkg = &resolve.packages[pkg];
            self.print_package_outer(pkg)?;
            if is_main {
                self.output.semicolon();
                self.output.newline();
            } else {
                self.output.indent_start();
            }
            for (name, id) in pkg.interfaces.iter() {
                self.print_interface_outer(resolve, *id, name)?;
                self.output.indent_start();
                self.print_interface(resolve, *id)?;
                self.output.indent_end();
                if is_main {
                    self.output.newline();
                }
            }
            for (name, id) in pkg.worlds.iter() {
                self.print_docs(&resolve.worlds[*id].docs);
                self.print_stability(&resolve.worlds[*id].stability);
                self.output.keyword("world");
                self.output.str(" ");
                self.print_name_type(name, TypeKind:...
    label: 1.0
  - sentence_0: An alternative descriptive name for the user.
    sentence_1:
        pub fn nick_name(mut self, input: impl ::std::convert::Into<::std::string::String>) -> Self {
            self.nick_name = ::std::option::Option::Some(input.into());
            self
        }
    label: 1.0
  - sentence_0: Indicates whether the match is case sensitive.
    sentence_1:
        pub fn case_sensitive(mut self, input: bool) -> Self {
            self.case_sensitive = ::std::option::Option::Some(input);
            self
        }
    label: 1.0
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
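MultipleNegativesRankingLoss treats each (sentence_0, sentence_1) pair as a positive and uses the other sentence_1 entries in the same batch as negatives: cosine similarities are multiplied by the scale (20.0 here) and passed through a cross-entropy whose target is the matching pair. The following is a rough sketch of that idea in plain Python, not the library's code. `mnr_loss` and `_cos` are hypothetical names, and the 2-dimensional vectors are toy stand-ins for embeddings.

```python
import math


def _cos(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i].

    Every other positives[j] in the batch acts as an in-batch negative.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * _cos(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the row of logits.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Well-aligned pairs give a loss near 0; swapped pairs give a loss near the scale.
aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

Because the negatives come from the batch itself, a larger batch gives each anchor more negatives to be ranked against.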
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 150
- per_device_eval_batch_size: 150
- num_train_epochs: 5
- fp16: True
- multi_dataset_batch_sampler: round_robin
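These settings fix the length of the run. As a back-of-the-envelope check, assuming a single device (so the per-device batch size of 150 is also the global batch size) and no gradient accumulation, the 2,732,400 training samples work out to 18,216 optimizer steps per epoch and 91,080 steps over 5 epochs. That assumption is consistent with the training logs, which report step 500 at epoch 0.0274.

```python
import math

num_samples = 2_732_400   # training set size from the dataset section
batch_size = 150          # per_device_train_batch_size; assumes one device
num_epochs = 5            # num_train_epochs

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)                   # 18216
print(total_steps)                       # 91080
print(round(500 / steps_per_epoch, 4))   # 0.0274
```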
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 150
- per_device_eval_batch_size: 150
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
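With lr_scheduler_type set to linear, warmup_steps at 0, and learning_rate at 5e-05, the usual linear-decay rule has the rate fall in a straight line from 5e-05 at the first step to 0 at the last. A minimal sketch of that rule follows. `linear_lr` is a hypothetical helper rather than the trainer's scheduler, and the total of 91,080 steps is an assumption derived from 2,732,400 samples, a batch size of 150 on a single device, and 5 epochs.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at an optimizer step under linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


ASSUMED_TOTAL_STEPS = 91_080  # assumption: 18,216 steps per epoch times 5 epochs

for step in (0, 45_540, 91_080):
    print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

Under this rule the rate at the midpoint of training is half the initial value.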
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0274 | 500 | 0.7202 |
0.0549 | 1000 | 0.1625 |
0.0823 | 1500 | 0.149 |
0.1098 | 2000 | 0.1388 |
0.1372 | 2500 | 0.1292 |
0.1647 | 3000 | 0.126 |
0.1921 | 3500 | 0.1204 |
0.2196 | 4000 | 0.1161 |
0.2470 | 4500 | 0.1074 |
0.2745 | 5000 | 0.1063 |
0.3019 | 5500 | 0.1004 |
0.3294 | 6000 | 0.0972 |
0.3568 | 6500 | 0.0941 |
0.3843 | 7000 | 0.0944 |
0.4117 | 7500 | 0.0884 |
0.4392 | 8000 | 0.0895 |
0.4666 | 8500 | 0.0867 |
0.4941 | 9000 | 0.0847 |
0.5215 | 9500 | 0.0805 |
0.5490 | 10000 | 0.0822 |
0.5764 | 10500 | 0.0784 |
0.6039 | 11000 | 0.0741 |
0.6313 | 11500 | 0.0734 |
0.6588 | 12000 | 0.0719 |
0.6862 | 12500 | 0.0687 |
0.7137 | 13000 | 0.0656 |
0.7411 | 13500 | 0.0681 |
0.7686 | 14000 | 0.0655 |
0.7960 | 14500 | 0.067 |
0.8235 | 15000 | 0.0628 |
0.8509 | 15500 | 0.0619 |
0.8783 | 16000 | 0.0592 |
0.9058 | 16500 | 0.0604 |
0.9332 | 17000 | 0.0585 |
0.9607 | 17500 | 0.0545 |
0.9881 | 18000 | 0.0543 |
1.0156 | 18500 | 0.0381 |
1.0430 | 19000 | 0.0263 |
1.0705 | 19500 | 0.0247 |
1.0979 | 20000 | 0.026 |
1.1254 | 20500 | 0.0263 |
1.1528 | 21000 | 0.0267 |
1.1803 | 21500 | 0.0277 |
1.2077 | 22000 | 0.0269 |
1.2352 | 22500 | 0.027 |
1.2626 | 23000 | 0.0274 |
1.2901 | 23500 | 0.0275 |
1.3175 | 24000 | 0.0283 |
1.3450 | 24500 | 0.0271 |
1.3724 | 25000 | 0.0269 |
1.3999 | 25500 | 0.0272 |
1.4273 | 26000 | 0.0263 |
1.4548 | 26500 | 0.0266 |
1.4822 | 27000 | 0.0259 |
1.5097 | 27500 | 0.0272 |
1.5371 | 28000 | 0.0277 |
1.5646 | 28500 | 0.0273 |
1.5920 | 29000 | 0.0251 |
1.6195 | 29500 | 0.0256 |
1.6469 | 30000 | 0.0256 |
1.6744 | 30500 | 0.0248 |
1.7018 | 31000 | 0.0253 |
1.7292 | 31500 | 0.0244 |
1.7567 | 32000 | 0.0242 |
1.7841 | 32500 | 0.0219 |
1.8116 | 33000 | 0.0246 |
1.8390 | 33500 | 0.023 |
1.8665 | 34000 | 0.0239 |
1.8939 | 34500 | 0.0217 |
1.9214 | 35000 | 0.0217 |
1.9488 | 35500 | 0.0224 |
1.9763 | 36000 | 0.0223 |
2.0037 | 36500 | 0.0201 |
2.0312 | 37000 | 0.0102 |
2.0586 | 37500 | 0.0097 |
2.0861 | 38000 | 0.009 |
2.1135 | 38500 | 0.0092 |
2.1410 | 39000 | 0.0094 |
2.1684 | 39500 | 0.0096 |
2.1959 | 40000 | 0.0101 |
2.2233 | 40500 | 0.0101 |
2.2508 | 41000 | 0.0099 |
2.2782 | 41500 | 0.01 |
2.3057 | 42000 | 0.01 |
2.3331 | 42500 | 0.01 |
2.3606 | 43000 | 0.0099 |
2.3880 | 43500 | 0.0098 |
2.4155 | 44000 | 0.0099 |
2.4429 | 44500 | 0.0101 |
2.4704 | 45000 | 0.0098 |
2.4978 | 45500 | 0.01 |
2.5253 | 46000 | 0.0099 |
2.5527 | 46500 | 0.0096 |
2.5801 | 47000 | 0.0092 |
2.6076 | 47500 | 0.0091 |
2.6350 | 48000 | 0.009 |
2.6625 | 48500 | 0.0091 |
2.6899 | 49000 | 0.0092 |
2.7174 | 49500 | 0.0095 |
2.7448 | 50000 | 0.0089 |
2.7723 | 50500 | 0.0093 |
2.7997 | 51000 | 0.0097 |
2.8272 | 51500 | 0.0092 |
2.8546 | 52000 | 0.0093 |
2.8821 | 52500 | 0.0091 |
2.9095 | 53000 | 0.0091 |
2.9370 | 53500 | 0.0089 |
2.9644 | 54000 | 0.0084 |
2.9919 | 54500 | 0.0078 |
3.0193 | 55000 | 0.0063 |
3.0468 | 55500 | 0.0046 |
3.0742 | 56000 | 0.0047 |
3.1017 | 56500 | 0.0051 |
3.1291 | 57000 | 0.0049 |
3.1566 | 57500 | 0.0049 |
3.1840 | 58000 | 0.0051 |
3.2115 | 58500 | 0.0048 |
3.2389 | 59000 | 0.0053 |
3.2664 | 59500 | 0.0049 |
3.2938 | 60000 | 0.0049 |
3.3213 | 60500 | 0.005 |
3.3487 | 61000 | 0.0055 |
3.3762 | 61500 | 0.0052 |
3.4036 | 62000 | 0.005 |
3.4310 | 62500 | 0.0049 |
3.4585 | 63000 | 0.0051 |
3.4859 | 63500 | 0.005 |
3.5134 | 64000 | 0.005 |
3.5408 | 64500 | 0.005 |
3.5683 | 65000 | 0.0046 |
3.5957 | 65500 | 0.0049 |
3.6232 | 66000 | 0.0045 |
3.6506 | 66500 | 0.0044 |
3.6781 | 67000 | 0.0046 |
3.7055 | 67500 | 0.0049 |
3.7330 | 68000 | 0.0049 |
3.7604 | 68500 | 0.0042 |
3.7879 | 69000 | 0.0042 |
3.8153 | 69500 | 0.0046 |
3.8428 | 70000 | 0.0049 |
3.8702 | 70500 | 0.0042 |
3.8977 | 71000 | 0.0041 |
3.9251 | 71500 | 0.0043 |
3.9526 | 72000 | 0.0042 |
3.9800 | 72500 | 0.0041 |
4.0075 | 73000 | 0.004 |
4.0349 | 73500 | 0.0031 |
4.0624 | 74000 | 0.0031 |
4.0898 | 74500 | 0.003 |
4.1173 | 75000 | 0.003 |
4.1447 | 75500 | 0.0029 |
4.1722 | 76000 | 0.0031 |
4.1996 | 76500 | 0.0029 |
4.2271 | 77000 | 0.003 |
4.2545 | 77500 | 0.0029 |
4.2819 | 78000 | 0.0029 |
4.3094 | 78500 | 0.0027 |
4.3368 | 79000 | 0.0028 |
4.3643 | 79500 | 0.0028 |
4.3917 | 80000 | 0.003 |
4.4192 | 80500 | 0.0027 |
4.4466 | 81000 | 0.0027 |
4.4741 | 81500 | 0.003 |
4.5015 | 82000 | 0.0028 |
4.5290 | 82500 | 0.0029 |
4.5564 | 83000 | 0.0027 |
4.5839 | 83500 | 0.0027 |
4.6113 | 84000 | 0.0029 |
4.6388 | 84500 | 0.0026 |
4.6662 | 85000 | 0.0027 |
4.6937 | 85500 | 0.0027 |
4.7211 | 86000 | 0.0025 |
4.7486 | 86500 | 0.0029 |
4.7760 | 87000 | 0.0027 |
4.8035 | 87500 | 0.0026 |
4.8309 | 88000 | 0.0028 |
4.8584 | 88500 | 0.0025 |
4.8858 | 89000 | 0.0024 |
4.9133 | 89500 | 0.0027 |
4.9407 | 90000 | 0.0026 |
4.9682 | 90500 | 0.0026 |
4.9956 | 91000 | 0.0028 |
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model tree for Shuu12121/CodeSearch-ModernBERT-Owl-2.2
Base model
Shuu12121/CodeModernBERT-Owl-2.2-Pre