---
license: mit
language:
- ja
pipeline_tag: text-generation
---
# Model Card for Tanrei/GPTSAN-japanese

General-purpose Switch-Transformer-based Japanese language model.
## Text Generation
```python
>>> from transformers import AutoModel, AutoTokenizer
>>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese")
>>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
>>> x_tok = tokenizer.encode("武田信玄は、", return_tensors="pt")
>>> model = model.cuda()
>>> c = model.generate(x_tok.cuda(), max_new_tokens=50, random_seed=63)
>>> tokenizer.decode(c[0])
'武田信玄は、戦国の頃より「智勇兼備」した英雄として織田信長に比されてきた戦国武将であり、...'
```
## Model Details
### Model Description
Japanese language model using the Switch Transformer architecture.
It has the same structure as the Prefix LM introduced
in the T5 paper, and supports both text generation and masked language modeling.
- Developed by: Toshiyuki Sakamoto (tanreinama)
- Model type: Switch Transformer
- Language(s) (NLP): Japanese
- License: MIT License
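The Prefix-LM structure mentioned above means the attention mask is hybrid: tokens in the prefix (the conditioning text) attend to each other bidirectionally, while the tokens being generated attend causally. As a minimal NumPy sketch of that mask shape only (illustrative; `prefix_lm_mask` is a hypothetical helper, not part of this model's actual implementation):

```python
import numpy as np

def prefix_lm_mask(prefix_len: int, seq_len: int) -> np.ndarray:
    """Illustrative Prefix-LM attention mask (1 = may attend, 0 = masked).

    Prefix positions attend to all prefix positions (bidirectional);
    positions after the prefix attend causally, as in a decoder-only LM.
    """
    mask = np.tril(np.ones((seq_len, seq_len), dtype=int))  # causal base
    mask[:prefix_len, :prefix_len] = 1  # prefix block is fully bidirectional
    return mask

# A length-4 sequence whose first 2 tokens form the prefix:
print(prefix_lm_mask(2, 4))
# [[1 1 0 0]
#  [1 1 0 0]
#  [1 1 1 0]
#  [1 1 1 1]]
```

This hybrid mask is why the same weights can serve both text generation (empty or short prefix, causal continuation) and masked-language-model-style infilling (the visible context goes in the bidirectional prefix).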
### Model Sources
- Repository: https://github.com/tanreinama/GPTSAN