splade-co-condenser-marco trained on MS MARCO hard negatives with distillation

This is a SPLADE Sparse Encoder model finetuned from Luyu/co-condenser-marco on the msmarco dataset using the sentence-transformers library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

Model Details

Model Description

  • Model Type: SPLADE Sparse Encoder
  • Base model: Luyu/co-condenser-marco
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 30522 dimensions
  • Similarity Function: Dot Product
  • Training Dataset: msmarco
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
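
The two modules above can be read as "per-token vocabulary logits, then pooling into one 30522-dimensional vector". The sketch below illustrates that pooling step in plain Python. It assumes the commonly described SPLADE formulation, log(1 + ReLU(logit)) followed by a max over token positions, which matches the 'relu' and 'max' settings shown above but is not copied from the library source. The function name `splade_pool` and the toy logits are hypothetical.

```python
import math


def splade_pool(token_logits, attention_mask):
    """Collapse per-token vocabulary logits into one sparse vector.

    token_logits: list of rows, one per token, each of vocabulary length.
    attention_mask: list of 0/1 flags; tokens flagged 0 (padding) are skipped.
    """
    vocab_size = len(token_logits[0])
    pooled = [0.0] * vocab_size
    for row, keep in zip(token_logits, attention_mask):
        if not keep:
            continue
        for j, logit in enumerate(row):
            # ReLU drops negative logits; log1p dampens large ones.
            weight = math.log1p(max(logit, 0.0))
            # Max pooling keeps the strongest activation per vocabulary entry.
            pooled[j] = max(pooled[j], weight)
    return pooled


# Toy input: 3 tokens (the last one is padding) over a 4-entry vocabulary.
logits = [
    [2.0, -1.0, 0.0, 0.5],
    [0.5, -3.0, 0.0, 4.0],
    [9.0, 9.0, 9.0, 9.0],  # padding row, masked out below
]
vector = splade_pool(logits, [1, 1, 0])
active_dims = sum(1 for w in vector if w > 0)
```

Only vocabulary entries that receive a positive logit from at least one real token end up non-zero, which is where the sparsity of the output comes from.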

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("arthurbresnu/splade-co-condenser-marco-msmarco-qwen3-reranker-0.6B-margin-mse")
# Run inference
queries = [
    "what town is grand lake st mary near",
]
documents = [
    'Grand Lake St. Marys State Park. Grand Lake St. Marys State Park is an American state park, west of St. Marys, and south-east of Celina, 23 miles (37 km) south-west of Lima in the north-western part of Ohio. Grand Lake covers 13,500 acres (5,500 ha) in Mercer and Auglaize counties.',
    'Lake Poinsett. Home > Florida Lakes > Lake Poinsett. Lake Poinsett BASS ONLINE 2016-10-18T14:26:01+00:00. Lake Poinsett Fishing. As the St. Johns River snakes out of Lake Washington and through the lush, green marshes, it eventually forms a â\x80\x98minorâ\x80\x99 wide spot in its trace some eight miles to the North.',
    'Slavery in America began when the first African slaves were brought to the North American colony of Jamestown, Virginia, in 1619, to aid in the production of such lucrative crops as tobacco.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[23.0166, 12.0320,  1.7877]])
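
To make the dot-product scoring concrete, here is a minimal sketch that does not use the library at all. It stores each sparse vector as a dictionary of active dimensions and sums products over the shared ones. The dimension ids, weights, and the helper name `sparse_dot` are made up for illustration; this is not how `model.similarity` is implemented.

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller dictionary; missing dimensions contribute 0.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    return sum(w * doc_vec.get(dim, 0.0) for dim, w in query_vec.items())


# Hypothetical active dimensions and weights.
query = {101: 1.5, 2054: 0.5, 3000: 2.0}
docs = [
    {101: 2.0, 3000: 1.0, 42: 0.7},  # shares two dimensions with the query
    {2054: 1.0},                     # shares one dimension
    {7: 3.0},                        # shares none
]
scores = [sparse_dot(query, d) for d in docs]
# Rank documents from highest to lowest score.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Because only overlapping active dimensions contribute, a document with no shared vocabulary entries scores exactly zero.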

Evaluation

Metrics

Sparse Information Retrieval

  • Datasets: NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
  • Evaluated with SparseInformationRetrievalEvaluator
Metric NanoMSMARCO NanoNFCorpus NanoNQ NanoClimateFEVER NanoDBPedia NanoFEVER NanoFiQA2018 NanoHotpotQA NanoQuoraRetrieval NanoSCIDOCS NanoArguAna NanoSciFact NanoTouche2020
dot_accuracy@1 0.38 0.36 0.5 0.32 0.78 0.76 0.4 0.84 0.82 0.42 0.14 0.52 0.6939
dot_accuracy@3 0.68 0.5 0.74 0.48 0.86 0.9 0.5 0.94 1.0 0.58 0.46 0.72 0.8163
dot_accuracy@5 0.76 0.62 0.78 0.58 0.9 0.92 0.62 0.96 1.0 0.66 0.58 0.74 0.9184
dot_accuracy@10 0.82 0.7 0.82 0.72 0.96 0.96 0.74 0.98 1.0 0.82 0.74 0.84 0.9796
dot_precision@1 0.38 0.36 0.5 0.32 0.78 0.76 0.4 0.84 0.82 0.42 0.14 0.52 0.6939
dot_precision@3 0.2267 0.32 0.2533 0.1733 0.68 0.3067 0.2133 0.4733 0.3867 0.2733 0.1533 0.26 0.6054
dot_precision@5 0.152 0.328 0.16 0.14 0.608 0.192 0.168 0.32 0.252 0.228 0.116 0.164 0.5551
dot_precision@10 0.082 0.264 0.088 0.098 0.516 0.104 0.106 0.172 0.134 0.164 0.074 0.094 0.4714
dot_recall@1 0.38 0.024 0.46 0.159 0.0805 0.7167 0.2261 0.42 0.744 0.0877 0.14 0.5 0.0472
dot_recall@3 0.68 0.0792 0.69 0.2357 0.1699 0.8533 0.2985 0.71 0.938 0.1697 0.46 0.71 0.1243
dot_recall@5 0.76 0.0983 0.73 0.2957 0.2371 0.8833 0.4133 0.8 0.9687 0.2347 0.58 0.725 0.189
dot_recall@10 0.82 0.1246 0.79 0.3847 0.3389 0.9433 0.527 0.86 0.99 0.3357 0.74 0.83 0.3121
dot_ndcg@10 0.6108 0.3105 0.6426 0.325 0.634 0.8472 0.4163 0.7969 0.9198 0.326 0.4314 0.6742 0.534
dot_mrr@10 0.5427 0.4639 0.6181 0.4319 0.834 0.8369 0.4877 0.895 0.9033 0.5275 0.3336 0.6251 0.7833
dot_map@100 0.551 0.1358 0.5913 0.2561 0.4791 0.8096 0.3523 0.7206 0.8869 0.2485 0.3456 0.6257 0.394
query_active_dims 47.4 46.64 55.28 110.88 48.7 76.32 52.36 81.46 50.98 77.1 184.38 93.36 47.4898
query_sparsity_ratio 0.9984 0.9985 0.9982 0.9964 0.9984 0.9975 0.9983 0.9973 0.9983 0.9975 0.994 0.9969 0.9984
corpus_active_dims 140.1564 215.3624 160.6441 192.7664 181.0865 218.5546 134.6856 189.121 54.8137 191.1029 172.4982 216.2216 139.9833
corpus_sparsity_ratio 0.9954 0.9929 0.9947 0.9937 0.9941 0.9928 0.9956 0.9938 0.9982 0.9937 0.9943 0.9929 0.9954
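
For readers unfamiliar with the metric names in the table, the following sketch computes them for a single query under the usual textbook definitions with binary relevance. It is an illustration only: the evaluator may differ in details, and the function name, document ids, and relevance set are hypothetical.

```python
import math


def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Accuracy, precision, recall, MRR and nDCG at cutoff k for one query.

    Assumes binary relevance and a non-empty set of relevant ids.
    """
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    accuracy = 1.0 if any(hits) else 0.0     # at least one relevant doc in top k
    precision = sum(hits) / k                # fraction of top k that is relevant
    recall = sum(hits) / len(relevant_ids)   # fraction of relevant docs retrieved

    # Reciprocal rank of the first relevant document (0 if none in top k).
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # DCG discounts each hit by log2(rank + 1); IDCG is the best achievable DCG.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mrr": mrr,
        "ndcg": dcg / idcg,
    }


# Hypothetical ranking: the two relevant documents sit at ranks 2 and 4.
metrics = retrieval_metrics(["d3", "d1", "d7", "d2"], {"d1", "d2"}, k=3)
```

The reported numbers are averages of such per-query values over each dataset's queries.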

Sparse Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with SparseNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ]
    }
    
Metric Value
dot_accuracy@1 0.4133
dot_accuracy@3 0.6267
dot_accuracy@5 0.7133
dot_accuracy@10 0.7867
dot_precision@1 0.4133
dot_precision@3 0.2644
dot_precision@5 0.2107
dot_precision@10 0.1447
dot_recall@1 0.2879
dot_recall@3 0.4693
dot_recall@5 0.5227
dot_recall@10 0.5778
dot_ndcg@10 0.5177
dot_mrr@10 0.5369
dot_map@100 0.4213
query_active_dims 49.6067
query_sparsity_ratio 0.9984
corpus_active_dims 164.0576
corpus_sparsity_ratio 0.9946
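
The sparsity ratios can be related to the active-dimension counts with a one-line formula. The sketch below assumes the ratio is the fraction of zero entries in the 30522-dimensional output, an assumption that is consistent with the values in the table above.

```python
VOCAB_SIZE = 30522  # output dimensionality from the model description


def sparsity_ratio(active_dims, dim=VOCAB_SIZE):
    """Fraction of dimensions that are zero, given the mean number of active ones."""
    return 1.0 - active_dims / dim


# Mean active dimensions taken from the table above.
query_ratio = round(sparsity_ratio(49.6067), 4)
corpus_ratio = round(sparsity_ratio(164.0576), 4)
```

Queries activate roughly 50 vocabulary entries on average and documents roughly 164, so both stay above 99% sparse.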

Sparse Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with SparseNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "climatefever",
            "dbpedia",
            "fever",
            "fiqa2018",
            "hotpotqa",
            "msmarco",
            "nfcorpus",
            "nq",
            "quoraretrieval",
            "scidocs",
            "arguana",
            "scifact",
            "touche2020"
        ]
    }
    
Metric Value
dot_accuracy@1 0.5334
dot_accuracy@3 0.7059
dot_accuracy@5 0.7722
dot_accuracy@10 0.8523
dot_precision@1 0.5334
dot_precision@3 0.3327
dot_precision@5 0.2602
dot_precision@10 0.1821
dot_recall@1 0.3065
dot_recall@3 0.4707
dot_recall@5 0.5319
dot_recall@10 0.6151
dot_ndcg@10 0.5745
dot_mrr@10 0.6371
dot_map@100 0.492
query_active_dims 74.8382
query_sparsity_ratio 0.9975
corpus_active_dims 164.4836
corpus_sparsity_ratio 0.9946
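
As a sanity check on how the aggregate relates to the per-dataset table, the sketch below averages the 13 per-dataset dot_ndcg@10 values. It assumes NanoBEIR_mean is a plain unweighted arithmetic mean across datasets; under that assumption the reported 0.5745 is reproduced.

```python
# Per-dataset dot_ndcg@10 values copied from the Sparse Information Retrieval table.
ndcg_at_10 = {
    "NanoMSMARCO": 0.6108,
    "NanoNFCorpus": 0.3105,
    "NanoNQ": 0.6426,
    "NanoClimateFEVER": 0.325,
    "NanoDBPedia": 0.634,
    "NanoFEVER": 0.8472,
    "NanoFiQA2018": 0.4163,
    "NanoHotpotQA": 0.7969,
    "NanoQuoraRetrieval": 0.9198,
    "NanoSCIDOCS": 0.326,
    "NanoArguAna": 0.4314,
    "NanoSciFact": 0.6742,
    "NanoTouche2020": 0.534,
}

# Unweighted mean: every dataset counts equally regardless of its size.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
```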

Training Details

Training Dataset

msmarco

  • Dataset: msmarco at 9e329ed
  • Size: 90,000 training samples
  • Columns: query, positive, negative, and score
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 4 tokens, mean 9.05 tokens, max 22 tokens
    • positive (string): min 17 tokens, mean 79.74 tokens, max 228 tokens
    • negative (string): min 14 tokens, mean 77.68 tokens, max 256 tokens
    • score (float): min -3.38, mean 10.51, max 21.0
  • Samples:
    • query: journal entries for standard cost variances
      positive: 1 Fiber Optic, Inc., investigates all variances above 10 percent of the flexible budget. 2 The flexible budget for direct materials is $50,000. 3 The direct materials price variance is $4,000 unfavorable and the direct materials quantity variance is $(6,000) favorable. Assuming a standard price of $5 per yard, prepare a journal entry to record the purchase of raw materials for the month. 2 The company used 39,000 yards of material in production for the month, and the flexible budget shows the company expected to use 40,800 yards.
      negative: In accounting the monthly close is the processing of transactions, journal entries and financial statements at the end of each month.
      score: 9.375
    • query: what county in pana, il in?
      positive: Pana /ˈpeɪnə/ is a city in Christian County, Illinois, United States. The population was 5,614 at the 2000 census.
      negative: Burr Ridge, IL is currently using an area code overlay in which area codes 331 and 630 serve the same geographic area. Ten digit dialing (area code + seven digit number) is necessary. In addition to Burr Ridge, IL area code information read more about area codes 331 and 630 details and Illinois area codes. Burr Ridge, IL is located in DuPage County and observes the Central Time Zone. View our Times by Area Code tool.
      score: 13.75
    • query: when was keep on loving you released
      positive: Share this page. REO's first Top 40 appearance proved to be a fruitful one, with the group taking Keep on Loving You to the number one spot in December of 1980.
      negative: Description: “If Loving You Is Wrong” is the new dramatic series created for television by writer/director Tyler Perry, premiering this fall on OWN. “If Loving You Is Wrong” is the compelling story of several women from very different walks of life.ack to the Have and The Have Nots, the scenes are too long and the characters are one dimensional. If it wasn't for Tika Sumpter, the show would be unbearable to watch. Love Thy Neighbor is the worst show ever. It is a throwback to blackface mistrel shows.
      score: 8.875
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMarginMSELoss",
        "document_regularizer_weight": 0.08,
        "query_regularizer_weight": 0.1
    }
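
The loss configuration above combines a distillation term with two sparsity regularizers. The sketch below shows one way to read it, under stated assumptions: the `score` column is treated as the teacher's positive-minus-negative margin, the regularizer is the FLOPS term from the paper cited under FlopsLoss (sum over dimensions of the squared mean activation), and the document regularizer is applied to positives and negatives together. The function names and toy vectors are hypothetical, and the library's actual implementation may differ.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def margin_mse(queries, positives, negatives, teacher_margins):
    """Mean squared error between student and teacher score margins."""
    errors = []
    for q, p, n, target in zip(queries, positives, negatives, teacher_margins):
        student_margin = dot(q, p) - dot(q, n)
        errors.append((student_margin - target) ** 2)
    return sum(errors) / len(errors)


def flops(batch):
    """FLOPS regularizer: sum over dimensions of the squared mean activation."""
    n = len(batch)
    return sum((sum(column) / n) ** 2 for column in zip(*batch))


def splade_loss(queries, positives, negatives, teacher_margins,
                query_weight=0.1, document_weight=0.08):
    """Distillation loss plus weighted sparsity penalties (weights from the config above)."""
    documents = positives + negatives  # assumption: regularize all documents together
    return (
        margin_mse(queries, positives, negatives, teacher_margins)
        + query_weight * flops(queries)
        + document_weight * flops(documents)
    )


# Toy batch of two triplets over a 2-dimensional "vocabulary".
queries = [[1.0, 0.0], [0.0, 2.0]]
positives = [[2.0, 0.0], [0.0, 1.0]]
negatives = [[0.0, 1.0], [1.0, 0.0]]
teacher_margins = [2.0, 1.0]
total = splade_loss(queries, positives, negatives, teacher_margins)
```

The regularizer weights trade retrieval quality against sparsity: larger values push more dimensions to zero at the cost of a looser fit to the teacher margins.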
    

Evaluation Dataset

msmarco

  • Dataset: msmarco at 9e329ed
  • Size: 10,000 evaluation samples
  • Columns: query, positive, negative, and score
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 4 tokens, mean 9.22 tokens, max 43 tokens
    • positive (string): min 19 tokens, mean 80.19 tokens, max 209 tokens
    • negative (string): min 14 tokens, mean 77.78 tokens, max 239 tokens
    • score (float): min -9.0, mean 10.74, max 21.75
  • Samples:
    • query: what trump said about obama playing golf during campaign
      positive: Obama also has played golf with Woods during his presidency, though typically the president’s golf partners are personal friends and select aides, as opposed to celebrities. At a campaign rally in December 2015, Trump ripped into Obama for playing hundreds of rounds of golf as president. “He played more golf last year than Tiger Woods,” Trump said suggestively. “We don’t have time for this. We have to work.”.
      negative: Trump slams Obama, Clinton for 'politically correct' war against ISIS, warns of more attacks. Republican presidential nominee Donald Trump has accused the Obama administration of waging a 'politically correct' war against the ISIS terror group and warned that more terror attacks would take place.
      score: 8.421875
    • query: how much volume is a gram
      positive: One gram is equal to 0.0353 ounces. A gram of sugar is approximately 1/4 teaspoon of sugar. A regular paper clip weighs about 1 gram. The gram and kilogram are units of mass in the metric system of measurement. The metric system was invented in France in 1799. It was improved in 1960 and named the System of International Units, or SI.
      negative: Divide the object's mass by its volume. This value is the object's density and expresses it in units of mass per unit of volume. For example, for a 20-gram mass that takes up a volume of 5 cubic centimeters, the density is 4 grams per cubic centimeter.Ad.ivide the object's mass by its volume. This value is the object's density and expresses it in units of mass per unit of volume. For example, for a 20-gram mass that takes up a volume of 5 cubic centimeters, the density is 4 grams per cubic centimeter. Ad.
      score: 2.65625
    • query: differences between the sexes
      positive: sexual dimorphism in birds can be manifested in size or plumage differences between the sexes sexual size dimorphism varies among taxa with males typically being larger though this is not always the case i e birds of prey hummingbirds and some species of flightless birds
      negative: Caribou are the only species of deer in which both sexes have antlers. Mature bulls can carry enormous and complex antlers, whereas cows and young animals generally have smaller and simpler ones. Mature bulls usually shed their antlers shortly after the rut whereas cows can keep theirs until spring.
      score: 10.21875
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMarginMSELoss",
        "document_regularizer_weight": 0.08,
        "query_regularizer_weight": 0.1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
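
These settings determine the length of the run and the shape of the learning-rate curve. The sketch below works through the arithmetic, assuming a single device, no gradient accumulation, and a linear warmup followed by linear decay to zero (the usual behaviour of a linear scheduler with a warmup ratio). The helper `lr_at` is hypothetical and is not a library function.

```python
import math

TRAIN_SAMPLES = 90_000  # size of the training dataset
BATCH_SIZE = 16         # per_device_train_batch_size, assuming one device
EPOCHS = 1              # num_train_epochs
BASE_LR = 2e-5          # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio

total_steps = math.ceil(TRAIN_SAMPLES / BATCH_SIZE) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to BASE_LR.
        return BASE_LR * step / warmup_steps
    # Decay linearly from BASE_LR to 0 at the final step.
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - warmup_steps)


# Fraction of the epoch completed after 100 steps.
epoch_at_step_100 = round(100 / total_steps, 4)
```

Under these assumptions the run has 5625 steps, and step 100 falls at epoch 0.0178.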

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss NanoMSMARCO_dot_ndcg@10 NanoNFCorpus_dot_ndcg@10 NanoNQ_dot_ndcg@10 NanoBEIR_mean_dot_ndcg@10 NanoClimateFEVER_dot_ndcg@10 NanoDBPedia_dot_ndcg@10 NanoFEVER_dot_ndcg@10 NanoFiQA2018_dot_ndcg@10 NanoHotpotQA_dot_ndcg@10 NanoQuoraRetrieval_dot_ndcg@10 NanoSCIDOCS_dot_ndcg@10 NanoArguAna_dot_ndcg@10 NanoSciFact_dot_ndcg@10 NanoTouche2020_dot_ndcg@10
0.0178 100 576200.8 - - - - - - - - - - - - - - -
0.0356 200 2635.0334 - - - - - - - - - - - - - - -
0.0533 300 70.7781 - - - - - - - - - - - - - - -
0.0711 400 46.7365 - - - - - - - - - - - - - - -
0.0889 500 33.3391 46.8835 0.5158 0.2778 0.6192 0.4709 - - - - - - - - - -
0.1067 600 29.4815 - - - - - - - - - - - - - - -
0.1244 700 27.123 - - - - - - - - - - - - - - -
0.1422 800 22.7267 - - - - - - - - - - - - - - -
0.16 900 22.2125 - - - - - - - - - - - - - - -
0.1778 1000 23.7129 22.1341 0.5768 0.2807 0.5689 0.4754 - - - - - - - - - -
0.1956 1100 23.1061 - - - - - - - - - - - - - - -
0.2133 1200 23.3015 - - - - - - - - - - - - - - -
0.2311 1300 19.0495 - - - - - - - - - - - - - - -
0.2489 1400 20.465 - - - - - - - - - - - - - - -
0.2667 1500 19.5227 18.4953 0.5447 0.2930 0.5663 0.4680 - - - - - - - - - -
0.2844 1600 19.7019 - - - - - - - - - - - - - - -
0.3022 1700 20.2723 - - - - - - - - - - - - - - -
0.32 1800 18.644 - - - - - - - - - - - - - - -
0.3378 1900 17.8863 - - - - - - - - - - - - - - -
0.3556 2000 17.824 21.6579 0.5722 0.2951 0.5739 0.4804 - - - - - - - - - -
0.3733 2100 18.2091 - - - - - - - - - - - - - - -
0.3911 2200 17.9996 - - - - - - - - - - - - - - -
0.4089 2300 15.7506 - - - - - - - - - - - - - - -
0.4267 2400 17.8921 - - - - - - - - - - - - - - -
0.4444 2500 16.3761 20.0396 0.5493 0.2811 0.6257 0.4854 - - - - - - - - - -
0.4622 2600 18.1791 - - - - - - - - - - - - - - -
0.48 2700 15.3429 - - - - - - - - - - - - - - -
0.4978 2800 14.9936 - - - - - - - - - - - - - - -
0.5156 2900 15.364 - - - - - - - - - - - - - - -
0.5333 3000 15.6449 17.3149 0.5672 0.3030 0.6095 0.4932 - - - - - - - - - -
0.5511 3100 15.6673 - - - - - - - - - - - - - - -
0.5689 3200 15.0578 - - - - - - - - - - - - - - -
0.5867 3300 15.906 - - - - - - - - - - - - - - -
0.6044 3400 15.6495 - - - - - - - - - - - - - - -
0.6222 3500 13.6636 14.5839 0.5683 0.2978 0.6191 0.4951 - - - - - - - - - -
0.64 3600 14.7215 - - - - - - - - - - - - - - -
0.6578 3700 15.1004 - - - - - - - - - - - - - - -
0.6756 3800 13.7198 - - - - - - - - - - - - - - -
0.6933 3900 13.9975 - - - - - - - - - - - - - - -
0.7111 4000 13.5657 14.8618 0.5983 0.3042 0.6183 0.5069 - - - - - - - - - -
0.7289 4100 13.8326 - - - - - - - - - - - - - - -
0.7467 4200 14.5209 - - - - - - - - - - - - - - -
0.7644 4300 13.4064 - - - - - - - - - - - - - - -
0.7822 4400 13.7625 - - - - - - - - - - - - - - -
0.8 4500 13.2154 14.3594 0.5734 0.3266 0.6345 0.5115 - - - - - - - - - -
0.8178 4600 13.7091 - - - - - - - - - - - - - - -
0.8356 4700 12.5913 - - - - - - - - - - - - - - -
0.8533 4800 12.433 - - - - - - - - - - - - - - -
0.8711 4900 13.0404 - - - - - - - - - - - - - - -
0.8889 5000 12.409 14.0825 0.6108 0.3105 0.6426 0.5213 - - - - - - - - - -
0.9067 5100 12.4556 - - - - - - - - - - - - - - -
0.9244 5200 12.4219 - - - - - - - - - - - - - - -
0.9422 5300 12.4269 - - - - - - - - - - - - - - -
0.96 5400 12.5363 - - - - - - - - - - - - - - -
0.9778 5500 12.4979 13.8156 0.6024 0.3101 0.6405 0.5177 - - - - - - - - - -
0.9956 5600 11.9616 - - - - - - - - - - - - - - -
-1 -1 - - 0.6108 0.3105 0.6426 0.5745 0.3250 0.6340 0.8472 0.4163 0.7969 0.9198 0.3260 0.4314 0.6742 0.5340
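  • The saved checkpoint appears to be the step 5000 row (epoch 0.8889): its NanoMSMARCO, NanoNFCorpus and NanoNQ scores match the final evaluation row (epoch -1, step -1).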

Framework Versions

  • Python: 3.13.3
  • Sentence Transformers: 4.2.0.dev0
  • Transformers: 4.53.0
  • PyTorch: 2.7.1+cu126
  • Accelerate: 1.8.1
  • Datasets: 3.6.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

SpladeLoss

@misc{formal2022distillationhardnegativesampling,
      title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
      author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
      year={2022},
      eprint={2205.04733},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2205.04733},
}

SparseMarginMSELoss

@misc{hofstätter2021improving,
    title={Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation},
    author={Sebastian Hofstätter and Sophia Althammer and Michael Schröder and Mete Sertkan and Allan Hanbury},
    year={2021},
    eprint={2010.02666},
    archivePrefix={arXiv},
    primaryClass={cs.IR}
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{'o}czos, Barnab{'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}