---
license: mit
language:
- ja
pipeline_tag: text-generation
---
# Model Card for Tanrei/GPTSAN-japanese

General-purpose Switch-Transformer-based Japanese language model.
## Text Generation
```python
>>> from transformers import AutoModel, AutoTokenizer
>>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese")
>>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
>>> x_tok = tokenizer.encode("武田信玄は、", return_tensors="pt")
>>> model = model.cuda()
>>> c = model.generate(x_tok.cuda(), max_new_tokens=50, random_seed=63)
>>> tokenizer.decode(c[0])
'武田信玄は、戦国の頃より「智勇兼備」した英雄として織田信長に比されてきた戦国武将であり、...'
```
## Model Details
### Model Description
Japanese language model using the Switch Transformer architecture.
It has the same structure as the Prefix LM introduced
in the T5 paper, and supports both text generation and masked language modeling.
- Developed by: Toshiyuki Sakamoto (tanreinama)
- Model type: Switch Transformer
- Language(s) (NLP): Japanese
- License: MIT License
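The Prefix-LM structure mentioned above means the attention mask is hybrid: tokens in the prefix (the conditioning text) attend to each other bidirectionally, while the tokens being generated attend causally. As a minimal NumPy sketch of that mask shape only (illustrative; `prefix_lm_mask` is a hypothetical helper, not part of this model's actual implementation):

```python
import numpy as np

def prefix_lm_mask(prefix_len: int, seq_len: int) -> np.ndarray:
    """Illustrative Prefix-LM attention mask (1 = may attend, 0 = masked).

    Prefix positions attend to all prefix positions (bidirectional);
    positions after the prefix attend causally, as in a decoder-only LM.
    """
    mask = np.tril(np.ones((seq_len, seq_len), dtype=int))  # causal base
    mask[:prefix_len, :prefix_len] = 1  # prefix block is fully bidirectional
    return mask

# A length-4 sequence whose first 2 tokens form the prefix:
print(prefix_lm_mask(2, 4))
# [[1 1 0 0]
#  [1 1 0 0]
#  [1 1 1 0]
#  [1 1 1 1]]
```

This hybrid mask is why the same weights can serve both text generation (empty or short prefix, causal continuation) and masked-language-model-style infilling (the visible context goes in the bidirectional prefix).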
### Model Sources
- Repository: https://github.com/tanreinama/GPTSAN