---
license: apache-2.0
datasets:
- google/wiki40b
language:
- zh
base_model:
- openai-community/gpt2
---
# Dorami
A GPT-based pretrained model that uses the BERT tokenizer
## Model description
### Training data
[google/wiki40b](https://huggingface.co/datasets/google/wiki40b)
### Training code
[dorami](https://github.com/6zeus/dorami.git)
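
Because the model pairs a GPT-2 architecture with a BERT-style tokenizer, it can be worth checking how the tokenizer splits Chinese text. A minimal sketch, assuming the Hub repo id `lucky2me/Dorami` used in the clone command below:

```python
from transformers import AutoTokenizer

# Load the tokenizer directly from the Hub (repo id taken from this card).
tokenizer = AutoTokenizer.from_pretrained("lucky2me/Dorami")

# BERT-style tokenizers typically split Chinese text into individual characters;
# printing the tokens lets you confirm the behavior for yourself.
print(tokenizer.tokenize("今天天气很好"))
```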
## How to use
### 1. Download the model from the Hugging Face Hub
```bash
git lfs install
git clone https://huggingface.co/lucky2me/Dorami
```
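
If you prefer not to use git-lfs, the `huggingface_hub` library can download the same files; a minimal sketch (the returned path is wherever the library caches the snapshot):

```python
from huggingface_hub import snapshot_download

# Download every file in the repo to a local directory and return its path.
model_path = snapshot_download(repo_id="lucky2me/Dorami")
print(model_path)
```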
### 2. Use the downloaded model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "The path of the model downloaded above"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.eval()

text = "fill in any text you like."
encoded_input = tokenizer(text, return_tensors="pt")

# Run a forward pass and greedily pick the most likely next token.
with torch.no_grad():
    output = model(**encoded_input)
predicted_token_id = torch.argmax(output.logits[:, -1, :], dim=-1)
decoded_text = tokenizer.decode(predicted_token_id, skip_special_tokens=True)
print("decoded text:", decoded_text)
```
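
The snippet above predicts only a single next token. For longer continuations, `model.generate` is the usual route; a minimal sketch (the sampling parameters are illustrative, not tuned for this model):

```python
# Generate a longer continuation instead of a single token.
generated = model.generate(
    **encoded_input,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```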