splade-co-condenser-marco trained on MS MARCO hard negatives with distillation
This is a SPLADE Sparse Encoder model finetuned from Luyu/co-condenser-marco on the msmarco dataset using the sentence-transformers library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
Model Details
Model Description
- Model Type: SPLADE Sparse Encoder
- Base model: Luyu/co-condenser-marco
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 30522 dimensions
- Similarity Function: Dot Product
- Training Dataset: msmarco
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Sparse Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sparse Encoders on Hugging Face
Full Model Architecture
SparseEncoder(
(0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
(1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
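As a rough illustration of what the SpladePooling module computes with pooling_strategy 'max' and activation_function 'relu': each token's masked-language-model logits are passed through log(1 + ReLU(x)), and the result is max-pooled over the tokens, giving one non-negative weight per vocabulary entry. The sketch below uses plain Python and a toy 5-entry vocabulary instead of 30522. The function name `splade_max_pool` and the logits are made up for illustration; this is not the library's implementation, and it ignores padding masks.

```python
import math

def splade_max_pool(token_logits):
    """Max-pool log(1 + relu(logit)) over tokens for each vocabulary entry.

    token_logits: list of per-token lists, each of length vocab_size.
    """
    vocab_size = len(token_logits[0])
    pooled = [0.0] * vocab_size
    for logits in token_logits:
        for j, x in enumerate(logits):
            weight = math.log1p(max(x, 0.0))  # log(1 + relu(x))
            pooled[j] = max(pooled[j], weight)
    return pooled

# Toy example: 2 tokens, vocabulary of 5 entries (hypothetical values).
toy_logits = [
    [-1.0, 0.0, 2.0, -3.0, 0.5],
    [-2.0, 1.0, 0.5, -0.1, 0.0],
]
embedding = splade_max_pool(toy_logits)
active_dims = sum(1 for w in embedding if w > 0)
print(embedding, active_dims)
```

Entries whose logits are never positive stay at exactly zero, which is where the sparsity of the output comes from.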
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("arthurbresnu/splade-co-condenser-marco-msmarco-qwen3-reranker-0.6B-margin-mse")
# Run inference
queries = [
"what town is grand lake st mary near",
]
documents = [
'Grand Lake St. Marys State Park. Grand Lake St. Marys State Park is an American state park, west of St. Marys, and south-east of Celina, 23 miles (37 km) south-west of Lima in the north-western part of Ohio. Grand Lake covers 13,500 acres (5,500 ha) in Mercer and Auglaize counties.',
'Lake Poinsett. Home > Florida Lakes > Lake Poinsett. Lake Poinsett BASS ONLINE 2016-10-18T14:26:01+00:00. Lake Poinsett Fishing. As the St. Johns River snakes out of Lake Washington and through the lush, green marshes, it eventually forms a â\x80\x98minorâ\x80\x99 wide spot in its trace some eight miles to the North.',
'Slavery in America began when the first African slaves were brought to the North American colony of Jamestown, Virginia, in 1619, to aid in the production of such lucrative crops as tobacco.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[23.0166, 12.0320, 1.7877]])
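The similarity scores above are dot products. Because almost all of the 30522 dimensions are zero, only vocabulary entries that are active in both the query and the document contribute to the score. A minimal plain-Python sketch of that idea, using dictionaries of {dimension index: weight} with made-up indices and weights (not actual model output):

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product of two sparse vectors stored as {index: weight} dicts."""
    # Iterate over the smaller dict and look up matches in the larger one.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    return sum(w * doc_vec[i] for i, w in query_vec.items() if i in doc_vec)

# Hypothetical sparse vectors.
query = {101: 1.5, 2045: 0.8, 7592: 2.0}
docs = [
    {101: 1.0, 7592: 1.5, 9999: 0.3},  # shares two dimensions with the query
    {2045: 0.5, 1234: 2.2},            # shares one dimension
    {42: 1.1, 4321: 0.7},              # shares none
]
scores = [sparse_dot(query, d) for d in docs]
print(scores)
```

A document with no overlapping active dimensions scores zero, so an inverted index over active dimensions is enough to find every candidate with a non-zero score.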
Evaluation
Metrics
Sparse Information Retrieval
- Datasets: NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
- Evaluated with SparseInformationRetrievalEvaluator
Metric | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
dot_accuracy@1 | 0.38 | 0.36 | 0.5 | 0.32 | 0.78 | 0.76 | 0.4 | 0.84 | 0.82 | 0.42 | 0.14 | 0.52 | 0.6939 |
dot_accuracy@3 | 0.68 | 0.5 | 0.74 | 0.48 | 0.86 | 0.9 | 0.5 | 0.94 | 1.0 | 0.58 | 0.46 | 0.72 | 0.8163 |
dot_accuracy@5 | 0.76 | 0.62 | 0.78 | 0.58 | 0.9 | 0.92 | 0.62 | 0.96 | 1.0 | 0.66 | 0.58 | 0.74 | 0.9184 |
dot_accuracy@10 | 0.82 | 0.7 | 0.82 | 0.72 | 0.96 | 0.96 | 0.74 | 0.98 | 1.0 | 0.82 | 0.74 | 0.84 | 0.9796 |
dot_precision@1 | 0.38 | 0.36 | 0.5 | 0.32 | 0.78 | 0.76 | 0.4 | 0.84 | 0.82 | 0.42 | 0.14 | 0.52 | 0.6939 |
dot_precision@3 | 0.2267 | 0.32 | 0.2533 | 0.1733 | 0.68 | 0.3067 | 0.2133 | 0.4733 | 0.3867 | 0.2733 | 0.1533 | 0.26 | 0.6054 |
dot_precision@5 | 0.152 | 0.328 | 0.16 | 0.14 | 0.608 | 0.192 | 0.168 | 0.32 | 0.252 | 0.228 | 0.116 | 0.164 | 0.5551 |
dot_precision@10 | 0.082 | 0.264 | 0.088 | 0.098 | 0.516 | 0.104 | 0.106 | 0.172 | 0.134 | 0.164 | 0.074 | 0.094 | 0.4714 |
dot_recall@1 | 0.38 | 0.024 | 0.46 | 0.159 | 0.0805 | 0.7167 | 0.2261 | 0.42 | 0.744 | 0.0877 | 0.14 | 0.5 | 0.0472 |
dot_recall@3 | 0.68 | 0.0792 | 0.69 | 0.2357 | 0.1699 | 0.8533 | 0.2985 | 0.71 | 0.938 | 0.1697 | 0.46 | 0.71 | 0.1243 |
dot_recall@5 | 0.76 | 0.0983 | 0.73 | 0.2957 | 0.2371 | 0.8833 | 0.4133 | 0.8 | 0.9687 | 0.2347 | 0.58 | 0.725 | 0.189 |
dot_recall@10 | 0.82 | 0.1246 | 0.79 | 0.3847 | 0.3389 | 0.9433 | 0.527 | 0.86 | 0.99 | 0.3357 | 0.74 | 0.83 | 0.3121 |
dot_ndcg@10 | 0.6108 | 0.3105 | 0.6426 | 0.325 | 0.634 | 0.8472 | 0.4163 | 0.7969 | 0.9198 | 0.326 | 0.4314 | 0.6742 | 0.534 |
dot_mrr@10 | 0.5427 | 0.4639 | 0.6181 | 0.4319 | 0.834 | 0.8369 | 0.4877 | 0.895 | 0.9033 | 0.5275 | 0.3336 | 0.6251 | 0.7833 |
dot_map@100 | 0.551 | 0.1358 | 0.5913 | 0.2561 | 0.4791 | 0.8096 | 0.3523 | 0.7206 | 0.8869 | 0.2485 | 0.3456 | 0.6257 | 0.394 |
query_active_dims | 47.4 | 46.64 | 55.28 | 110.88 | 48.7 | 76.32 | 52.36 | 81.46 | 50.98 | 77.1 | 184.38 | 93.36 | 47.4898 |
query_sparsity_ratio | 0.9984 | 0.9985 | 0.9982 | 0.9964 | 0.9984 | 0.9975 | 0.9983 | 0.9973 | 0.9983 | 0.9975 | 0.994 | 0.9969 | 0.9984 |
corpus_active_dims | 140.1564 | 215.3624 | 160.6441 | 192.7664 | 181.0865 | 218.5546 | 134.6856 | 189.121 | 54.8137 | 191.1029 | 172.4982 | 216.2216 | 139.9833 |
corpus_sparsity_ratio | 0.9954 | 0.9929 | 0.9947 | 0.9937 | 0.9941 | 0.9928 | 0.9956 | 0.9938 | 0.9982 | 0.9937 | 0.9943 | 0.9929 | 0.9954 |
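The sparsity ratios in the table appear to follow directly from the active-dimension counts: with a 30522-dimensional output, the ratio is 1 - active_dims / 30522. A short check against the two NanoMSMARCO values reported above (the helper name is illustrative):

```python
VOCAB_SIZE = 30522  # output dimensionality of this model

def sparsity_ratio(active_dims, dim=VOCAB_SIZE):
    """Fraction of dimensions that are zero, given the mean number of active ones."""
    return 1.0 - active_dims / dim

# Mean active dimensions reported for NanoMSMARCO.
query_ratio = sparsity_ratio(47.4)
corpus_ratio = sparsity_ratio(140.1564)
print(round(query_ratio, 4), round(corpus_ratio, 4))
```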
Sparse Nano BEIR
- Dataset: NanoBEIR_mean
- Evaluated with SparseNanoBEIREvaluator with these parameters: { "dataset_names": [ "msmarco", "nfcorpus", "nq" ] }
Metric | Value |
---|---|
dot_accuracy@1 | 0.4133 |
dot_accuracy@3 | 0.6267 |
dot_accuracy@5 | 0.7133 |
dot_accuracy@10 | 0.7867 |
dot_precision@1 | 0.4133 |
dot_precision@3 | 0.2644 |
dot_precision@5 | 0.2107 |
dot_precision@10 | 0.1447 |
dot_recall@1 | 0.2879 |
dot_recall@3 | 0.4693 |
dot_recall@5 | 0.5227 |
dot_recall@10 | 0.5778 |
dot_ndcg@10 | 0.5177 |
dot_mrr@10 | 0.5369 |
dot_map@100 | 0.4213 |
query_active_dims | 49.6067 |
query_sparsity_ratio | 0.9984 |
corpus_active_dims | 164.0576 |
corpus_sparsity_ratio | 0.9946 |
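For readers unfamiliar with the metric names, the sketch below computes accuracy@k, precision@k, recall@k, MRR@k, and NDCG@k for a single query with binary relevance, in plain Python. It illustrates the standard definitions rather than the evaluator's own code, and the reported numbers are averages of such per-query values. The document IDs are hypothetical.

```python
import math

def metrics_at_k(ranked_ids, relevant_ids, k):
    """Standard binary-relevance retrieval metrics for one query."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    # Reciprocal rank of the first relevant document in the top k (0 if none).
    rr = next((1.0 / (rank + 1) for rank, h in enumerate(hits) if h), 0.0)
    # DCG with binary gains, normalised by the best achievable DCG.
    dcg = sum(h / math.log2(rank + 2) for rank, h in enumerate(hits))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return {
        "accuracy": 1.0 if any(hits) else 0.0,
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "mrr": rr,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }

ranked = ["d3", "d7", "d1", "d9", "d2"]
relevant = {"d7", "d2", "d8"}
result = metrics_at_k(ranked, relevant, k=3)
print(result)
```

Under these definitions accuracy@1 and precision@1 always coincide, which matches the identical rows in the tables.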
Sparse Nano BEIR
- Dataset: NanoBEIR_mean
- Evaluated with SparseNanoBEIREvaluator with these parameters: { "dataset_names": [ "climatefever", "dbpedia", "fever", "fiqa2018", "hotpotqa", "msmarco", "nfcorpus", "nq", "quoraretrieval", "scidocs", "arguana", "scifact", "touche2020" ] }
Metric | Value |
---|---|
dot_accuracy@1 | 0.5334 |
dot_accuracy@3 | 0.7059 |
dot_accuracy@5 | 0.7722 |
dot_accuracy@10 | 0.8523 |
dot_precision@1 | 0.5334 |
dot_precision@3 | 0.3327 |
dot_precision@5 | 0.2602 |
dot_precision@10 | 0.1821 |
dot_recall@1 | 0.3065 |
dot_recall@3 | 0.4707 |
dot_recall@5 | 0.5319 |
dot_recall@10 | 0.6151 |
dot_ndcg@10 | 0.5745 |
dot_mrr@10 | 0.6371 |
dot_map@100 | 0.492 |
query_active_dims | 74.8382 |
query_sparsity_ratio | 0.9975 |
corpus_active_dims | 164.4836 |
corpus_sparsity_ratio | 0.9946 |
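The NanoBEIR_mean values appear to be the unweighted average of the per-dataset scores. A quick check using the thirteen dot_ndcg@10 values from the per-dataset table in the Sparse Information Retrieval section:

```python
# dot_ndcg@10 per dataset, copied from the per-dataset metrics table.
ndcg_at_10 = {
    "NanoMSMARCO": 0.6108,
    "NanoNFCorpus": 0.3105,
    "NanoNQ": 0.6426,
    "NanoClimateFEVER": 0.325,
    "NanoDBPedia": 0.634,
    "NanoFEVER": 0.8472,
    "NanoFiQA2018": 0.4163,
    "NanoHotpotQA": 0.7969,
    "NanoQuoraRetrieval": 0.9198,
    "NanoSCIDOCS": 0.326,
    "NanoArguAna": 0.4314,
    "NanoSciFact": 0.6742,
    "NanoTouche2020": 0.534,
}
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))  # 0.5745, the reported dot_ndcg@10 mean
```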
Training Details
Training Dataset
msmarco
- Dataset: msmarco at 9e329ed
- Size: 90,000 training samples
- Columns: query, positive, negative, and score
- Approximate statistics based on the first 1000 samples:
| | query | positive | negative | score |
|---|---|---|---|---|
| type | string | string | string | float |
| details | min: 4 tokens, mean: 9.05 tokens, max: 22 tokens | min: 17 tokens, mean: 79.74 tokens, max: 228 tokens | min: 14 tokens, mean: 77.68 tokens, max: 256 tokens | min: -3.38, mean: 10.51, max: 21.0 |
- Samples:
query | positive | negative | score |
---|---|---|---|
journal entries for standard cost variances | 1 Fiber Optic, Inc., investigates all variances above 10 percent of the flexible budget. 2 The flexible budget for direct materials is $50,000. 3 The direct materials price variance is $4,000 unfavorable and the direct materials quantity variance is $(6,000) favorable. Assuming a standard price of $5 per yard, prepare a journal entry to record the purchase of raw materials for the month. 2 The company used 39,000 yards of material in production for the month, and the flexible budget shows the company expected to use 40,800 yards. | In accounting the monthly close is the processing of transactions, journal entries and financial statements at the end of each month. | 9.375 |
what county in pana, il in? | Pana /ˈpeɪnə/ is a city in Christian County, Illinois, United States. The population was 5,614 at the 2000 census. | Burr Ridge, IL is currently using an area code overlay in which area codes 331 and 630 serve the same geographic area. Ten digit dialing (area code + seven digit number) is necessary. In addition to Burr Ridge, IL area code information read more about area codes 331 and 630 details and Illinois area codes. Burr Ridge, IL is located in DuPage County and observes the Central Time Zone. View our Times by Area Code tool. | 13.75 |
when was keep on loving you released | Share this page. REO's first Top 40 appearance proved to be a fruitful one, with the group taking Keep on Loving You to the number one spot in December of 1980. | Description: “If Loving You Is Wrong” is the new dramatic series created for television by writer/director Tyler Perry, premiering this fall on OWN. “If Loving You Is Wrong” is the compelling story of several women from very different walks of life.ack to the Have and The Have Nots, the scenes are too long and the characters are one dimensional. If it wasn't for Tika Sumpter, the show would be unbearable to watch. Love Thy Neighbor is the worst show ever. It is a throwback to blackface mistrel shows. | 8.875 |
- Loss: SpladeLoss with these parameters: { "loss": "SparseMarginMSELoss", "document_regularizer_weight": 0.08, "query_regularizer_weight": 0.1 }
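To make the loss configuration more concrete, here is a minimal plain-Python sketch of how the pieces could combine for one batch: a margin-MSE term that pushes the student's score margin (positive minus negative) toward a teacher margin, plus FLOPS regularizers on the query and document embeddings, weighted by query_regularizer_weight and document_regularizer_weight. It assumes that the score column holds the teacher margin and that both regularizers are FLOPS terms. Function names and toy vectors are illustrative, and this is not the library's implementation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def margin_mse(queries, positives, negatives, teacher_margins):
    """Mean squared error between student and teacher score margins."""
    errors = [
        (dot(q, p) - dot(q, n) - t) ** 2
        for q, p, n, t in zip(queries, positives, negatives, teacher_margins)
    ]
    return sum(errors) / len(errors)

def flops(embeddings):
    """FLOPS regularizer: sum over dimensions of the squared mean activation."""
    batch = len(embeddings)
    dims = len(embeddings[0])
    return sum((sum(e[j] for e in embeddings) / batch) ** 2 for j in range(dims))

def splade_style_loss(queries, positives, negatives, teacher_margins,
                      query_weight=0.1, document_weight=0.08):
    """Margin-MSE plus weighted sparsity regularizers (assumed combination)."""
    base = margin_mse(queries, positives, negatives, teacher_margins)
    reg_q = flops(queries)
    reg_d = flops(positives + negatives)
    return base + query_weight * reg_q + document_weight * reg_d

# Toy batch of two triplets over a 3-entry vocabulary (hypothetical values).
queries = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
positives = [[2.0, 0.0, 1.0], [0.0, 3.0, 0.0]]
negatives = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
teacher_margins = [4.0, 3.0]
loss = splade_style_loss(queries, positives, negatives, teacher_margins)
print(loss)
```

In this toy batch the student margins already equal the teacher margins, so the whole loss comes from the two regularizers. Larger regularizer weights trade retrieval quality for sparser vectors.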
Evaluation Dataset
msmarco
- Dataset: msmarco at 9e329ed
- Size: 10,000 evaluation samples
- Columns: query, positive, negative, and score
- Approximate statistics based on the first 1000 samples:
| | query | positive | negative | score |
|---|---|---|---|---|
| type | string | string | string | float |
| details | min: 4 tokens, mean: 9.22 tokens, max: 43 tokens | min: 19 tokens, mean: 80.19 tokens, max: 209 tokens | min: 14 tokens, mean: 77.78 tokens, max: 239 tokens | min: -9.0, mean: 10.74, max: 21.75 |
- Samples:
query | positive | negative | score |
---|---|---|---|
what trump said about obama playing golf during campaign | Obama also has played golf with Woods during his presidency, though typically the president’s golf partners are personal friends and select aides, as opposed to celebrities. At a campaign rally in December 2015, Trump ripped into Obama for playing hundreds of rounds of golf as president. “He played more golf last year than Tiger Woods,” Trump said suggestively. “We don’t have time for this. We have to work.”. | Trump slams Obama, Clinton for 'politically correct' war against ISIS, warns of more attacks. Republican presidential nominee Donald Trump has accused the Obama administration of waging a 'politically correct' war against the ISIS terror group and warned that more terror attacks would take place. | 8.421875 |
how much volume is a gram | One gram is equal to 0.0353 ounces. A gram of sugar is approximately 1/4 teaspoon of sugar. A regular paper clip weighs about 1 gram. The gram and kilogram are units of mass in the metric system of measurement. The metric system was invented in France in 1799. It was improved in 1960 and named the System of International Units, or SI. | Divide the object's mass by its volume. This value is the object's density and expresses it in units of mass per unit of volume. For example, for a 20-gram mass that takes up a volume of 5 cubic centimeters, the density is 4 grams per cubic centimeter.Ad.ivide the object's mass by its volume. This value is the object's density and expresses it in units of mass per unit of volume. For example, for a 20-gram mass that takes up a volume of 5 cubic centimeters, the density is 4 grams per cubic centimeter. Ad. | 2.65625 |
differences between the sexes | sexual dimorphism in birds can be manifested in size or plumage differences between the sexes sexual size dimorphism varies among taxa with males typically being larger though this is not always the case i e birds of prey hummingbirds and some species of flightless birds | Caribou are the only species of deer in which both sexes have antlers. Mature bulls can carry enormous and complex antlers, whereas cows and young animals generally have smaller and simpler ones. Mature bulls usually shed their antlers shortly after the rut whereas cows can keep theirs until spring. | 10.21875 |
- Loss: SpladeLoss with these parameters: { "loss": "SparseMarginMSELoss", "document_regularizer_weight": 0.08, "query_regularizer_weight": 0.1 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
- load_best_model_at_end: True
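These settings imply a simple learning-rate schedule. Assuming a single device and no gradient accumulation, 90,000 training samples at batch size 16 give 5,625 optimizer steps, which is consistent with the Epoch and Step columns of the training logs (step 100 is about epoch 0.0178). The sketch below reproduces a linear warmup followed by linear decay in plain Python. Rounding the warmup length up to a whole step is an assumption about how the ratio is converted into steps.

```python
import math

NUM_SAMPLES = 90_000
BATCH_SIZE = 16
BASE_LR = 2e-5
WARMUP_RATIO = 0.1

total_steps = math.ceil(NUM_SAMPLES / BATCH_SIZE)     # 5625
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)  # 563

def linear_schedule_lr(step):
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - warmup_steps)

for step in (0, 100, warmup_steps, 3000, total_steps):
    print(step, linear_schedule_lr(step))
```

The learning rate starts at zero during the first logged steps, which may help explain why the earliest training-loss values in the logs are so large.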
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_dot_ndcg@10 | NanoNFCorpus_dot_ndcg@10 | NanoNQ_dot_ndcg@10 | NanoBEIR_mean_dot_ndcg@10 | NanoClimateFEVER_dot_ndcg@10 | NanoDBPedia_dot_ndcg@10 | NanoFEVER_dot_ndcg@10 | NanoFiQA2018_dot_ndcg@10 | NanoHotpotQA_dot_ndcg@10 | NanoQuoraRetrieval_dot_ndcg@10 | NanoSCIDOCS_dot_ndcg@10 | NanoArguAna_dot_ndcg@10 | NanoSciFact_dot_ndcg@10 | NanoTouche2020_dot_ndcg@10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.0178 | 100 | 576200.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0356 | 200 | 2635.0334 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0533 | 300 | 70.7781 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0711 | 400 | 46.7365 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0889 | 500 | 33.3391 | 46.8835 | 0.5158 | 0.2778 | 0.6192 | 0.4709 | - | - | - | - | - | - | - | - | - | - |
0.1067 | 600 | 29.4815 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1244 | 700 | 27.123 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1422 | 800 | 22.7267 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.16 | 900 | 22.2125 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1778 | 1000 | 23.7129 | 22.1341 | 0.5768 | 0.2807 | 0.5689 | 0.4754 | - | - | - | - | - | - | - | - | - | - |
0.1956 | 1100 | 23.1061 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2133 | 1200 | 23.3015 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2311 | 1300 | 19.0495 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2489 | 1400 | 20.465 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2667 | 1500 | 19.5227 | 18.4953 | 0.5447 | 0.2930 | 0.5663 | 0.4680 | - | - | - | - | - | - | - | - | - | - |
0.2844 | 1600 | 19.7019 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3022 | 1700 | 20.2723 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.32 | 1800 | 18.644 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3378 | 1900 | 17.8863 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3556 | 2000 | 17.824 | 21.6579 | 0.5722 | 0.2951 | 0.5739 | 0.4804 | - | - | - | - | - | - | - | - | - | - |
0.3733 | 2100 | 18.2091 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3911 | 2200 | 17.9996 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4089 | 2300 | 15.7506 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4267 | 2400 | 17.8921 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4444 | 2500 | 16.3761 | 20.0396 | 0.5493 | 0.2811 | 0.6257 | 0.4854 | - | - | - | - | - | - | - | - | - | - |
0.4622 | 2600 | 18.1791 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.48 | 2700 | 15.3429 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4978 | 2800 | 14.9936 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5156 | 2900 | 15.364 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5333 | 3000 | 15.6449 | 17.3149 | 0.5672 | 0.3030 | 0.6095 | 0.4932 | - | - | - | - | - | - | - | - | - | - |
0.5511 | 3100 | 15.6673 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5689 | 3200 | 15.0578 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5867 | 3300 | 15.906 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6044 | 3400 | 15.6495 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6222 | 3500 | 13.6636 | 14.5839 | 0.5683 | 0.2978 | 0.6191 | 0.4951 | - | - | - | - | - | - | - | - | - | - |
0.64 | 3600 | 14.7215 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6578 | 3700 | 15.1004 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6756 | 3800 | 13.7198 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6933 | 3900 | 13.9975 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7111 | 4000 | 13.5657 | 14.8618 | 0.5983 | 0.3042 | 0.6183 | 0.5069 | - | - | - | - | - | - | - | - | - | - |
0.7289 | 4100 | 13.8326 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7467 | 4200 | 14.5209 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7644 | 4300 | 13.4064 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7822 | 4400 | 13.7625 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8 | 4500 | 13.2154 | 14.3594 | 0.5734 | 0.3266 | 0.6345 | 0.5115 | - | - | - | - | - | - | - | - | - | - |
0.8178 | 4600 | 13.7091 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8356 | 4700 | 12.5913 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8533 | 4800 | 12.433 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8711 | 4900 | 13.0404 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
**0.8889** | **5000** | **12.409** | **14.0825** | **0.6108** | **0.3105** | **0.6426** | **0.5213** | - | - | - | - | - | - | - | - | - | - |
0.9067 | 5100 | 12.4556 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9244 | 5200 | 12.4219 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9422 | 5300 | 12.4269 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.96 | 5400 | 12.5363 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9778 | 5500 | 12.4979 | 13.8156 | 0.6024 | 0.3101 | 0.6405 | 0.5177 | - | - | - | - | - | - | - | - | - | - |
0.9956 | 5600 | 11.9616 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
-1 | -1 | - | - | 0.6108 | 0.3105 | 0.6426 | 0.5745 | 0.3250 | 0.6340 | 0.8472 | 0.4163 | 0.7969 | 0.9198 | 0.3260 | 0.4314 | 0.6742 | 0.5340 |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.13.3
- Sentence Transformers: 4.2.0.dev0
- Transformers: 4.53.0
- PyTorch: 2.7.1+cu126
- Accelerate: 1.8.1
- Datasets: 3.6.0
- Tokenizers: 0.21.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
SpladeLoss
@misc{formal2022distillationhardnegativesampling,
title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
year={2022},
eprint={2205.04733},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2205.04733},
}
SparseMarginMSELoss
@misc{hofstätter2021improving,
title={Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation},
author={Sebastian Hofstätter and Sophia Althammer and Michael Schröder and Mete Sertkan and Allan Hanbury},
year={2021},
eprint={2010.02666},
archivePrefix={arXiv},
primaryClass={cs.IR}
}
FlopsLoss
@article{paria2020minimizing,
title={Minimizing flops to learn efficient sparse representations},
author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
journal={arXiv preprint arXiv:2004.05665},
year={2020}
}