SentenceTransformer based on nomic-ai/modernbert-embed-base
This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the touch-rugby-modernbert-pairs dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: nomic-ai/modernbert-embed-base
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: touch-rugby-modernbert-pairs
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
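The Pooling and Normalize stages listed in the architecture can be illustrated with a minimal pure-Python sketch. It assumes toy 2-dimensional token vectors instead of the model's 768 dimensions, and the helper names `mean_pool` and `l2_normalize` are hypothetical, not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input (assumption): three token vectors, the last one is padding.
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)  # approximately [0.6, 0.8]
```

Because the output is unit length, the cosine similarity between two embeddings reduces to their dot product.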
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("dujun/modernbert-embed-base-dj-ft-v2")
# Run inference
sentences = [
'When does a player cease to be the Half?',
'13.10\tA player ceases to be the Half once the ball is passed to another player.13.11\tDefending players are not to interfere with the performance of the Rollball or the \nHalf.Ruling = A Penalty to the Attacking Team at a point ten (10) metres directly Forward of the \nInfringement.13.12\tPlayers of the Defending Team must not move Forward of the Onside position \nuntil the Half has made contact with the ball, unless directed to do so by the \nReferee or in accordance with 13.12.1.13.12.1\tWhen the Half is not within one (1) metre of the Rollball, Onside players \nof the Defending Team may move Forward as soon as the player \nperforming the Rollball releases the ball.If the Half is not in position and \na defending player moves Forward and makes contact with the ball, a \nChange of Possession results.',
'18.7\tA player may perform a Rollball instead of a Penalty Tap and the player who \nreceives the ball does not become the Half.18.8\tIf the Defending Team is penalised three (3) times upon entering their Seven \nMetre Zone during a single Possession, the last offending player will be given an \nExclusion until the end of that Possession.18.9\tA Penalty Try is awarded if any action by a player, Team official or spectator, \ndeemed by the Referee to be contrary to the Rules or spirit of the game clearly \nprevents the Attacking Team from scoring a Try.FIT Playing Rules - 5th Edition\nCOPYRIGHT © Touch Football Australia 2020\n15\n19\u2002 Advantage \n19.1\tWhere a Defending Team player is Offside at a Tap or Rollball and attempts \nto interfere with play, the Referee will allow Advantage or award a Penalty, \nwhichever is of greater Advantage to the Attacking Team.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
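For semantic search, the similarity scores are typically used to rank candidate chunks against a question. The following is a minimal sketch of that ranking step only, using hand-written unit vectors in place of real model output; `rank_chunks` is a hypothetical helper, not a library function.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_chunks(query_embedding, chunk_embeddings):
    """Return (chunk_index, score) pairs sorted by descending similarity.

    Assumes every vector is already unit length, so the dot product
    equals the cosine similarity.
    """
    scores = [(i, dot(query_embedding, c)) for i, c in enumerate(chunk_embeddings)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy unit vectors (assumption): stand-ins for model.encode(...) output.
query = [0.6, 0.8]
chunks = [[1.0, 0.0], [0.8, 0.6], [0.6, 0.8]]

ranking = rank_chunks(query, chunks)
for index, score in ranking:
    print(index, round(score, 2))
```

With real embeddings, `query` would be the encoded question and `chunks` the encoded rule passages.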
Training Details
Training Dataset
touch-rugby-modernbert-pairs
- Dataset: touch-rugby-modernbert-pairs at 7cb0ae2
- Size: 274 training samples
- Columns: question and related_chunk
- Approximate statistics based on the first 274 samples:
column | type | min | mean | max |
---|---|---|---|---|
question | string | 10 tokens | 18.51 tokens | 36 tokens |
related_chunk | string | 147 tokens | 230.74 tokens | 319 tokens |
- Samples:
question | related_chunk |
---|---|
Where does a change of possession occur if a touch is made in In-Goal? | Ruling = A Penalty to the non-offending Team at the point of the Infringement.10.4 If the ball is accidentally knocked from the hands of a player in Possession during a Touch, the Touch counts and the Attacking Team retains Possession.10.5 The defending player must not deliberately knock the ball from the hands of a player in Possession during a Touch.Ruling = A Penalty to the Attacking Team at the point of the Infringement.10.6 A player must not pass or otherwise deliver the ball after a Touch has been made.Ruling = A Penalty to the Defending Team at the point of the Infringement, or if In-Goal the nearest point on the seven (7) metre line.10.7 The Half may pass or run with the ball but cannot get Touched while in Possession of the ball.Ruling = A Change of Possession occurs at the point of the Touch, or if In-Goal the nearest point on the seven (7) metre line. |
What section details the field of play in the Touch Rugby Rules 5th Edition? | FIT Playing Rules - 5th Edition COPYRIGHT © Touch Football Australia 2020 Appendix 1 – Field of Play Contents 01 I The Field of Play 5 02 I Player Registration 5 03 I The Ball 6 04 I Playing Uniform 6 05 I Team Composition 6 06 I Team Coach and Team Officials 7 07 I Commencement and Recommencement of Play 7 08 I Match Duration 8 09 I Possession 8 10 I The Touch 9 11 I Passing 10 12 I Ball Touched in Flight 10 13 I The Rollball 11 14 I Scoring 13 15 I Offside 13 16 I Obstruction 14 17 I Interchange 14 18 I Penalty 15 19 I Advantage 16 20 I Misconduct 16 21 I Forced Interchange 16 22 I Sin Bin 16 23 I Dismissal 17 |
What is one of the Referee's responsibilities before the match commences? | An approach may only be made during a break in play or at the discretion of the Referee.FIT Playing Rules - 5th Edition 18 COPYRIGHT © Touch Football Australia 2020 HALFWAY LINE SIN BIN AREAS IN-GOAL AREA TRY LINE 7 M ZONE DEAD BALL LINE PERIMETER INTERCHANGE AREA 20M 10M 10M 1M 5M 7 M 7 M 7 M 7 M 50M 3M 70M INTERCHANGE AREA Appendix 1 – Field of Play FIT Playing Rules - 5th Edition COPYRIGHT © Touch Football Australia 2020 19 FEDERATION OF INTERNATIONAL TOUCH |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
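MultipleNegativesRankingLoss treats each question's paired chunk as the positive and every other chunk in the batch as a negative, then applies cross-entropy over the scaled cosine similarities. The sketch below is a simplified pure-Python illustration of that idea with `scale` set to 20.0 as in the parameters above. It is not the library's implementation, and `mnr_loss` and the toy vectors are assumptions made for the example.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(question_embs, chunk_embs, scale=20.0):
    """Mean cross-entropy where chunk i is the correct match for question i."""
    losses = []
    for i, q in enumerate(question_embs):
        logits = [scale * cos_sim(q, c) for c in chunk_embs]
        # Numerically stable log-sum-exp over all chunks in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs (assumption): orthogonal questions.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(questions, [[1.0, 0.0], [0.0, 1.0]])   # chunks match their questions
swapped = mnr_loss(questions, [[0.0, 1.0], [1.0, 0.0]])   # chunks paired with the wrong question
uniform = mnr_loss([[1.0, 0.0]] * 2, [[1.0, 0.0]] * 2)    # embeddings carry no signal

print(aligned, swapped, uniform)
```

When the embeddings cannot tell the chunks apart, the loss equals the natural log of the batch size, which gives a rough reference point for reading loss values.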
Evaluation Dataset
touch-rugby-modernbert-pairs
- Dataset: touch-rugby-modernbert-pairs at 7cb0ae2
- Size: 31 evaluation samples
- Columns: question and related_chunk
- Approximate statistics based on the first 31 samples:
column | type | min | mean | max |
---|---|---|---|---|
question | string | 12 tokens | 19.52 tokens | 33 tokens |
related_chunk | string | 170 tokens | 234.45 tokens | 304 tokens |
- Samples:
question | related_chunk |
---|---|
Where must a player's identifying number be displayed? | 3.2 The ball shall be inflated to the manufacturers’ recommended air pressure.3.3 The Referee shall immediately pause the match if the size and shape of the ball no longer complies with clauses 3.1 or 3.2 to allow for the ball to replaced or the issue rectified.3.4 The ball must not be hidden under player attire.4 Playing Uniform 4.1 Participating players are to be correctly attired in matching Team uniforms 4.2 Playing uniforms consist of shirt, singlet or other item as approved by the NTA or NTA competition provider, shorts and/or tights and socks.4.3 All players are to wear a unique identifying number not less than 16cm in height, clearly displayed on the rear of the playing top.4.3.1 Identifying numbers must feature no more than two (2) digits.4.4 Hats or caps are permitted to be worn during a match provided they are safe and meet any NTA regulations. |
Besides penalties, what other consequences might result from continuous rule breaches? | 20 Misconduct 20.1 Misconduct warranting Penalty, Forced Interchange, Sin Bin or Dismissal includes: 20.1.1 Continuous or regular breaches of the Rules; 20.1.2 Swearing towards another player, Referee, spectator or other match official; 20.1.3 Disputing decisions of Referees or other match official(s); 20.1.4 Using more than the necessary physical force to make a Touch; 20.1.5 Poor sportsmanship; 20.1.6 Tripping, striking, or otherwise assaulting another player, Referee, spectator or other match official; or 20.1.7 Any other action that is contrary to the spirit of the game.21 Forced Interchange 21.1 Where the Referee deems it necessary to implement a Forced Interchange following an Infringement, the Referee is to stop the match, direct the ball to be placed on the Mark, advise the offending player of the reason for the Forced Interchange, direct that player to return to the Interchange Area, display the relevant signal and award a Penalty to the non-offending Team. |
Can a Rollball be performed after a Touch has been made? | Ruling = A Penalty to the Defending Team at the point of the Infringement.13.5 A player may only perform a Rollball at the Mark under the following circumstances: 13.5.1 when a Touch has been made; or 13.5.2 when Possession changes following the sixth Touch; or 13.5.3 when Possession changes due to the ball being dropped or passed and goes to the ground; or 13.5.4 when Possession changes due to an Infringement by an attacking player at a Penalty, a Tap or a Rollball; or FIT Playing Rules - 5th Edition COPYRIGHT © Touch Football Australia 2020 11 13.5.5 when Possession changes after the Half is Touched or when the Half places the ball on or over the Try Line; or 13.5.6 in replacement of a Penalty Tap; or 13.5.7 when so directed by the Referee. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- learning_rate: 5e-06
- num_train_epochs: 1
- lr_scheduler_type: constant
- warmup_ratio: 0.3
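In the Transformers library, the `constant` scheduler type does not use warmup steps (the `constant_with_warmup` variant does), so the `warmup_ratio` of 0.3 probably had no effect on this run. The sketch below compares the two learning-rate multipliers in pure Python. It assumes the 5 optimizer steps shown in the Training Logs, and the function names are hypothetical stand-ins for the library's schedule lambdas.

```python
import math

def constant_multiplier(step):
    """`constant`: the learning rate stays at its base value for every step."""
    return 1.0

def constant_with_warmup_multiplier(step, warmup_steps):
    """`constant_with_warmup`: linear ramp from 0 to 1, then constant."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return 1.0

base_lr = 5e-06
total_steps = 5                              # assumption: taken from the Training Logs table
warmup_steps = math.ceil(0.3 * total_steps)  # warmup_ratio applied to the step count

constant_lrs = [base_lr * constant_multiplier(s) for s in range(total_steps)]
warmup_lrs = [base_lr * constant_with_warmup_multiplier(s, warmup_steps)
              for s in range(total_steps)]

print(constant_lrs)
print(warmup_lrs)
```

If a warmup phase is wanted in a future run, switching `lr_scheduler_type` to `constant_with_warmup` would be the variant to try.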
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-06
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: constant
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.3
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.2 | 1 | - | 2.7507 |
0.4 | 2 | 3.6185 | 2.7254 |
0.6 | 3 | - | 2.7059 |
0.8 | 4 | 3.4585 | 2.6828 |
1.0 | 5 | - | 2.6653 |
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 4.1.0
- Transformers: 4.51.3
- PyTorch: 2.7.0+cu126
- Accelerate: 1.7.0
- Datasets: 2.17.1
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for dujun/modernbert-embed-base-dj-ft-v2
Base model
answerdotai/ModernBERT-base
Finetuned
nomic-ai/modernbert-embed-base