---
license: apache-2.0
datasets:
- google/wiki40b
language:
- zh
base_model:
- openai-community/gpt2
---
# Dorami

A GPT-based pretrained model using the BERT tokenizer.
## Model description

Dorami is a GPT-style causal language model (base model: openai-community/gpt2) paired with a BERT tokenizer instead of GPT-2's byte-level BPE tokenizer.

## Training data

The model was pretrained on the Chinese (`zh`) portion of the [google/wiki40b](https://huggingface.co/datasets/google/wiki40b) dataset.

## Training code
## How to use
### 1. Download the model from the Hugging Face Hub

```bash
git lfs install
git clone https://huggingface.co/lucky2me/Dorami
```
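
Cloning with git is optional; `from_pretrained` can also fetch and cache the checkpoint directly from the Hub when given the repo id. A minimal sketch, assuming network access and the repo id `lucky2me/Dorami`:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download (and cache) the files from the Hub instead of cloning with git lfs.
tokenizer = AutoTokenizer.from_pretrained("lucky2me/Dorami")
model = AutoModelForCausalLM.from_pretrained("lucky2me/Dorami")
```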
### 2. Use the model downloaded above
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "The path of the model downloaded above"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.eval()

text = "Fill in any text you like."
encoded_input = tokenizer(text, return_tensors="pt")

# Single forward pass; pick the most likely next token from the last position.
with torch.no_grad():
    output = model(**encoded_input)
predicted_token_id = torch.argmax(output.logits[:, -1, :], dim=-1)
decoded_text = tokenizer.decode(predicted_token_id, skip_special_tokens=True)
print("decoded text:", decoded_text)
```