SRDberta

This is a BERT model trained for masked language modeling (MLM) on English data.

Dataset

Columns of the Hinglish-TOP dataset:

  • en_query
  • cs_query
  • en_parse
  • cs_parse
  • domain

Training

Epoch   Loss
1       0.0485
2       0.00837
3       0.00812
4       0.0029
5       0.014
6       0.00748
7       0.0041
8       0.00543
9       0.00304
10      0.000574
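
For readers new to the objective, the sketch below illustrates the general idea behind masked language modeling: some tokens are hidden behind a mask token and the model is trained to recover them. This is only an illustration in plain Python. The `mask_tokens` helper, the whitespace tokenization, the sample sentence, and the masking probability are all assumptions made for the example and are not taken from this model's training setup.

```python
import random


def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Hide a random subset of tokens behind the mask token.

    Returns the masked sequence and a dict mapping each masked position
    to its original token (the targets the model has to predict).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    masked, labels = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels[i] = tok
        else:
            masked.append(tok)
    return masked, labels


# Illustrative sentence, split on whitespace instead of using the real tokenizer
tokens = "aap kaise ho aaj".split()
masked, labels = mask_tokens(tokens, mask_prob=0.5)
print(masked)
print(labels)
```

During training, the loss is computed only on the positions stored in `labels`; the unmasked tokens serve as context.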

Inference

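from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("SRDdev/SRDBerta")
model = AutoModelForMaskedLM.from_pretrained("SRDdev/SRDBerta")

# Build the fill-mask pipeline from the objects loaded above
fill = pipeline('fill-mask', model=model, tokenizer=tokenizer)

# Use the tokenizer's own mask token so the prompt matches the vocabulary
fill_mask = fill.tokenizer.mask_token
print(fill(f'Aap {fill_mask} ho?'))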
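
For a single masked position, the `fill-mask` pipeline returns a list of dictionaries with `score`, `token`, `token_str`, and `sequence` keys. The sketch below shows one way to reduce that list to the top candidates. The `top_predictions` helper is hypothetical, and the `example` list is a hand-written stand-in with made-up scores, not real output from this model.

```python
def top_predictions(results, k=3):
    """Return the k highest-scoring (token_str, score) pairs.

    `results` is expected to look like fill-mask pipeline output:
    a list of dicts that each contain "token_str" and "score".
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"].strip(), round(r["score"], 4)) for r in ranked[:k]]


# Stand-in for pipeline output; tokens and scores are invented for illustration
example = [
    {"score": 0.10, "token": 3, "token_str": "kaun", "sequence": "aap kaun ho?"},
    {"score": 0.60, "token": 1, "token_str": "kaise", "sequence": "aap kaise ho?"},
    {"score": 0.30, "token": 2, "token_str": "kahan", "sequence": "aap kahan ho?"},
]

for token, score in top_predictions(example, k=2):
    print(f"{token}: {score}")
```

With the real pipeline, the same helper could be called as `top_predictions(fill(f'Aap {fill_mask} ho?'))`.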

Citation

Author: @SRDdev

Name: Shreyas Dixit
Framework: PyTorch
Year: Jan 2023
Pipeline: fill-mask
GitHub: https://github.com/SRDdev
LinkedIn: https://www.linkedin.com/in/srddev/