bengali-t5-base
bengali-t5-base is a model trained on the Bengali portion of the mC4 dataset (the corpus used to pretrain mT5), using the T5-base architecture.

This model was trained as part of the Flax/JAX Community Week organized by Hugging Face, with TPU usage sponsored by Google.

The model was trained on around 11B tokens (batch size 64, sequence length 512, 350k steps).
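As a rough sanity check, the quoted figures multiply out to about 11.5B tokens, assuming every step processes one full batch of 512-token sequences:

>>> 64 * 512 * 350_000  # batch size * sequence length * training steps
11468800000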
load tokenizer
>>> import transformers
>>> tokenizer = transformers.AutoTokenizer.from_pretrained("flax-community/bengali-t5-base")
>>> tokenizer.encode("আমি বাংলার গান গাই")
[93, 1912, 814, 5995, 3, 1]
>>> tokenizer.decode([93, 1912, 814, 5995, 3, 1])
'আমি বাংলার গান গাই </s>'
load model
>>> from transformers import T5Config, FlaxT5ForConditionalGeneration
>>> config = T5Config.from_pretrained("flax-community/bengali-t5-base")
>>> model = FlaxT5ForConditionalGeneration.from_pretrained("flax-community/bengali-t5-base", config=config)
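As a minimal sketch of a forward pass with the Flax model (the Bengali sentence and the masked span are illustrative, and assume the tokenizer defines the standard T5 sentinel tokens such as <extra_id_0>):

>>> inputs = tokenizer("আমি <extra_id_0> গান গাই", return_tensors="np")
>>> targets = tokenizer("<extra_id_0> বাংলার <extra_id_1>", return_tensors="np")
>>> outputs = model(input_ids=inputs.input_ids, decoder_input_ids=targets.input_ids)
>>> outputs.logits.shape  # (batch, target length, vocabulary size)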
The model was trained with a de-noising objective, following the scripts here and here. It currently has no generation capability. If you want generation capability, fine-tune the model on the prefix-LM objective described in the paper.
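For illustration, a hypothetical sketch of how the two data formats differ (the sentence split is made up):

>>> # de-noising / span corruption (what this model saw during pretraining):
>>> # sentinel tokens mark the masked spans in the source and delimit them in the target
>>> source, target = "আমি <extra_id_0> গাই", "<extra_id_0> বাংলার গান <extra_id_1>"
>>> # prefix-LM (the finetuning objective needed for generation):
>>> # the source is a prefix of the text and the target is its continuation
>>> source, target = "আমি বাংলার", "গান গাই"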
See the TensorBoard logs in the Training metrics tab.
Please note that we have not fine-tuned the model on any downstream task.
Proposal
Participants
Useful links