SentenceTransformer based on law-ai/InLegalBERT

This is a sentence-transformers model fine-tuned from law-ai/InLegalBERT. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: law-ai/InLegalBERT
  • Maximum Sequence Length: 320 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 320, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
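
The Pooling module above has only pooling_mode_mean_tokens enabled, so each sentence embedding is the average of the token embeddings produced by the BertModel, with padding positions excluded. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name mean_pool is hypothetical, and the toy vectors use 3 dimensions instead of 768.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask value is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: 3 real tokens plus 1 padding token, 3 dimensions instead of 768.
tokens = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 2.0, 2.0]
```

The padding vector (all 9.0) does not affect the result because its mask value is 0.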

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("amixh/sentence-embedding-model-InLegalBERT-2")
# Run inference
sentences = [
    '[CONSTITUTION_ARTICLE_252] Power of Parliament to legislate for two or more States by consent and adoption of such legislation by any other State (1) If it appears to the Legislatures of two or more States to be desirable that any of the matters with respect to which Parliament has no power to make laws for the States except as provided in Articles 249 and 250 should be regulated in such States by Parliament by law, and if resolutions to that effect are passed by all the House of the Legislatures of those States, it shall be lawful for Parliament to pass an Act for regulating that matter accordingly, and any Act so passed shall apply to such States and to any other State by which it is adopted afterwards by resolution passed in that behalf by the House or, where there are two Houses, by each of the Houses of the Legislature of that State (2) Any Act so passed by Parliament may be amended or repealed by an Act of Parliament passed or adopted in like manner but shall not, as respects any State to which it applies, be amended or repealed by an Act of the Legislature of that State',
    'Power of Parliament to legislate for two or more States by consent and adoption of such legislation by any other State (1) If it appears to the Legislatures of two or more States to be desirable that any of the matters with respect to which Parliament has no power to make laws for the States except as provided in Articles 249 and 250 should be regulated in such States by Parliament by law, and if resolutions to that effect are passed by all the House of the Legislatures of those States, it shall be lawful for Parliament to pass an Act for regulating that matter accordingly, and any Act so passed shall apply to such States and to any other State by which it is adopted afterwards by resolution passed in that behalf by the House or, where there are two Houses, by each of the Houses of the Legislature of that State (2) Any Act so passed by Parliament may be amended or repealed by an Act of Parliament passed or adopted in like manner but shall not, as respects any State to which it applies, be amended or repealed by an Act of the Legislature of that State',
    '[CRPC_SECTION_206] Section 206, If, in the opinion of a Magistrate taking cognizance of a petty offence, the case may be summarily disposed of under section 260 or section 261, the Magistrate shall, except where he is, for reasons to be recorded in writing of a contrary opinion, issue summons to the accused requiring him either to appear in person or by pleader before the Magistrate on a specified date, or if he desires to plead guilty to the charge without appearing before the Magistrate, to transmit before the specified date, by post or by messenger to the Magistrate, the said plea in writing and the amount of fine specified in the summons or if he desires to appear by pleader and to plead guilty to the charge through such pleader, to authorise, in writing, the pleader to plead guilty to the charge on his behalf and to pay the fine through such pleader; Provided that the amount of the fine specified in such summons shall not exceed one thousand rupees. For the purposes of this section, “petty offence” means any offence punishable only with fine not exceeding one thousand rupees, but does not include any offence so punishable under the Motor Vehicles Act, 1931, or under any other law which provides for convicting the accused person in his absence on a plea of guilty. The State Government may, by notification, specially empower any Magistrate to exercise the powers conferred by Sub-Section (1) in relation to any offence which is compoundable under section 320 or any offence punishable with imprisonment for a term not exceeding three months, or with fine or with both where the Magistrate is of opinion that, having regard to the facts and circumstances of the case, the imposition of fine only would meet the ends of justice.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
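
Since the card lists Cosine Similarity as the similarity function, the 3 x 3 matrix above should hold pairwise cosine similarities between the three embeddings. As an illustration only, here is a pure-Python sketch of that computation on toy 2-dimensional vectors; the helper names cosine_similarity and similarity_matrix are hypothetical and this is not how the library computes it internally.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarities, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for embeddings: the first two point the same way, the third is orthogonal.
toy = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
matrix = similarity_matrix(toy)
for row in matrix:
    print([round(value, 3) for value in row])
```

Vector length does not matter for cosine similarity, which is why the first two toy vectors score 1.0 against each other despite different magnitudes.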

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,788 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    column      type    min tokens  mean tokens  max tokens
    sentence_0  string  14          138.36       320
    sentence_1  string  5           130.74       320
    sentence_2  string  14          138.37       320
  • Samples:
    Sample 1
      sentence_0: [IPC_SECTION_395] According to Whoever commits dacoity shall be punished with imprisonment for life, or with rigorous imprisonment for a term which may extend to ten years, and shall also be liable to fine. IPC 395 in Simple Words Whoever commits dacoity shall be punished with either life imprisonment or rigorous imprisonment up to ten years, and may also face a fine.
      sentence_1: According to Whoever commits dacoity shall be punished with imprisonment for life, or with rigorous imprisonment for a term which may extend to ten years, and shall also be liable to fine. IPC 395 in Simple Words Whoever commits dacoity shall be punished with either life imprisonment or rigorous imprisonment up to ten years, and may also face a fine.
      sentence_2: [CONSTITUTION_ARTICLE_293] Borrowing by States (1) Subject to the provisions of this article, the executive power of a State extends to borrowing within the territory of India upon the security of the Consolidated Fund of the State within such limits, if any, as may from time to time be fixed by the Legislature of such State by law and to the giving of guarantees within such limits, if any, as may be so fixed (2) The Government of India may, subject to such conditions as may be laid down by or under any law made by Parliament, make loans to any State or, so long as any limits fixed under Article 292 are not exceeded, give guarantees in respect of loans raised by any State, and any sums required for the purpose of making such loans shall be charged on the Consolidated Fund of India (3) A State may not without the consent of the Government of India raise any loan if there is still outstanding any part of a loan which has been made to the State by the Government of India or by its predece...
    Sample 2
      sentence_0: [IPC_SECTION_344] According to Whoever wrongfully confines any person for ten days, or more, shall be punished with imprisonment of either description for a term which may extend to three years, and shall also be liable to fine. IPC 344 in Simple Words Section 344 of the states that anyone who wrongfully confines a person for ten days or more can be punished with imprisonment for up to three years and may also be fined.
      sentence_1: According to Whoever wrongfully confines any person for ten days, or more, shall be punished with imprisonment of either description for a term which may extend to three years, and shall also be liable to fine. IPC 344 in Simple Words Section 344 of the states that anyone who wrongfully confines a person for ten days or more can be punished with imprisonment for up to three years and may also be fined.
      sentence_2: [CRPC_SECTION_296] Section 296, The evidence of any person whose evidence is of a formal character may be given by affidavit and may, subject to all just exceptions, be read in evidence in any inquiry, trial or other proceeding under this Code. The Court may, if it thinks fit, and shall, on the application of the prosecution or the accused, summon and examine any such person as to the facts contained in his affidavit.
    Sample 3
      sentence_0: [CRPC_SECTION_263] Section 263, In every case tried summarily, the Magistrate shall enter, in such form as the Stale Government may direct, the following particulars, namely— the serial number of the case; the date of the commission of the offence; the date of the report of complaint; the name of the complainant (if any); the name, parentage and residence of the accused; the offence complained of and the offence (if any) proved, and in cases coming under clause (ii), clause (iii) or clause (iv) of Sub-Section (1) of section 260, the value of the property in respect of which the offence has been committed; the plea of the accused and his examination (if any); the finding; the sentence or other final order; the date on which proceedings terminated.
      sentence_1: Section 263, In every case tried summarily, the Magistrate shall enter, in such form as the Stale Government may direct, the following particulars, namely— the serial number of the case; the date of the commission of the offence; the date of the report of complaint; the name of the complainant (if any); the name, parentage and residence of the accused; the offence complained of and the offence (if any) proved, and in cases coming under clause (ii), clause (iii) or clause (iv) of Sub-Section (1) of section 260, the value of the property in respect of which the offence has been committed; the plea of the accused and his examination (if any); the finding; the sentence or other final order; the date on which proceedings terminated.
      sentence_2: [CRPC_SECTION_342] Section 342, Any Court dealing with an application made to it for filing a complaint under section 340 or an appeal under section 341, shall have power to make such order as to costs as may be just.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.5
    }
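
To make the loss parameters above concrete: with a cosine distance metric and a margin of 0.5, a triplet loss is zero once the negative is at least 0.5 farther from the anchor than the positive is. The sketch below is a pure-Python illustration on toy vectors, assuming the columns are used as anchor (sentence_0), positive (sentence_1) and negative (sentence_2); that column mapping and the helper names are assumptions, and this is not the library's code.

```python
import math
from typing import List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 minus cosine similarity, so identical directions give 0.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_loss(anchor: List[float], positive: List[float],
                 negative: List[float], margin: float = 0.5) -> float:
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(gap + margin, 0.0)


# Toy embeddings (assumed, 2-dimensional for readability).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # same direction as the anchor
negative = [0.0, 1.0]   # orthogonal to the anchor

easy = triplet_loss(anchor, positive, negative)   # already separated by more than the margin
hard = triplet_loss(anchor, negative, positive)   # roles swapped: the worst case for these vectors
print(easy, hard)  # 0.0 1.5
```

Triplets that already satisfy the margin contribute no gradient, so only the "hard" cases move the embeddings during training.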
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
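
The hyperparameters above imply a rough training length and learning-rate curve. The sketch below works through the arithmetic under stated assumptions: a single device, all 1,788 samples used each epoch, and a linear schedule that decays from learning_rate to zero with no warmup. The helper names are hypothetical; the decay rule mirrors the usual linear-with-warmup formula rather than quoting any library's code.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Optimizer steps per epoch with gradient_accumulation_steps = 1."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


per_epoch = steps_per_epoch(1788, 16)   # dataloader_drop_last is False
total_steps = per_epoch * 3             # num_train_epochs = 3
print(per_epoch, total_steps)           # 112 336

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the run is short (a few hundred steps), which is worth keeping in mind when judging how far the embeddings may have moved from the base model.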

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 4.0.1
  • Transformers: 4.50.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Model size: 112M params (Safetensors, tensor type F32)