---

license: apache-2.0
datasets:
- google/wiki40b
language:
- zh
base_model:
- openai-community/gpt2
---


# Dorami

A GPT-based pretrained model for Chinese that uses the BERT tokenizer instead of GPT-2's default byte-level BPE.
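
Because the tokenizer is BERT-style (WordPiece with `[CLS]`/`[SEP]` special tokens) rather than GPT-2's byte-level BPE, this is easy to check directly. A minimal sketch, using the repo id from the clone step below:

```python
from transformers import AutoTokenizer

# Repo id taken from the clone URL in the usage section below
tokenizer = AutoTokenizer.from_pretrained("lucky2me/Dorami")

print(type(tokenizer).__name__)      # expected: a BERT-style tokenizer class
print(tokenizer.special_tokens_map)  # expected: [CLS], [SEP], [PAD], ...
```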

## Model description

### Training data

[google/wiki40b](https://huggingface.co/datasets/google/wiki40b)

### Training code

[dorami](https://github.com/6zeus/dorami.git)

## How to use

### 1. Download the model from the Hugging Face Hub

```bash
git lfs install
git clone https://huggingface.co/lucky2me/Dorami
```
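
Alternatively, the files can be fetched programmatically with `huggingface_hub`. A sketch, assuming the package is installed; the repo id matches the clone URL above:

```python
from huggingface_hub import snapshot_download

# Downloads all repo files into the local HF cache and returns the local path
model_path = snapshot_download(repo_id="lucky2me/Dorami")
print(model_path)
```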

### 2. Use the downloaded model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Path of the local clone from step 1
model_path = "The path of the model downloaded above"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Any input text (the model was trained on Chinese Wikipedia)
text = "fill in any text you like."
encoded_input = tokenizer(text, return_tensors="pt")

# Single forward pass; pick the most likely next token at the last position
output = model(**encoded_input)
predicted_token_id = torch.argmax(output.logits[:, -1, :], dim=-1)
decoded_text = tokenizer.decode(predicted_token_id, skip_special_tokens=True)
print("decoded text:", decoded_text)
```
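
The snippet above predicts only the single most likely next token. For longer continuations, the standard `generate` API can be used instead. A minimal sketch, continuing from the variables above; the sampling parameters are illustrative, not tuned values:

```python
# Sampling-based generation; parameter values here are illustrative
generated_ids = model.generate(
    **encoded_input,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```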