---
license: apache-2.0
datasets:
- google/wiki40b
language:
- zh
base_model:
- openai-community/gpt2
---

# Dorami

A GPT-based pretrained model that uses the BERT tokenizer.

## Model description

### Training data

[google/wiki40b](https://huggingface.co/datasets/google/wiki40b)

### Training code

[dorami](https://github.com/6zeus/dorami.git)

## How to use

### 1. Download the model from the Hugging Face Hub to a local directory

```bash
git lfs install
git clone https://huggingface.co/lucky2me/Dorami
```
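
Alternatively, `transformers` can download the model directly from the Hub by repo id, without a manual `git clone`. A minimal sketch, assuming the repo id `lucky2me/Dorami` (taken from the clone URL above) is publicly accessible:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Repo id inferred from the clone URL above; the files are downloaded and cached locally.
repo_id = "lucky2me/Dorami"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
```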

### 2. Use the model downloaded above

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Path to the local clone of the model repository.
model_path = "The path of the model downloaded above"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

text = "fill in any text you like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)

# Greedily pick the most likely next token from the logits at the last position.
predicted_token_id = torch.argmax(output.logits[:, -1, :], dim=-1)
decoded_text = tokenizer.decode(predicted_token_id, skip_special_tokens=True)
print("decoded text:", decoded_text)
```
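
The snippet above predicts only the single next token. For longer continuations, `model.generate` can be used instead; a minimal sketch, assuming the same local `model_path` as above and an arbitrary limit of 50 new tokens with greedy decoding (not settings from this model card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "The path of the model downloaded above"  # same local path as in the snippet above
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

text = "fill in any text you like."
encoded_input = tokenizer(text, return_tensors="pt")

# Generate up to 50 new tokens with greedy decoding; adjust max_new_tokens or sampling as needed.
with torch.no_grad():
    output_ids = model.generate(
        input_ids=encoded_input["input_ids"],
        attention_mask=encoded_input["attention_mask"],
        max_new_tokens=50,
        do_sample=False,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```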