Aananda-giri committed cbb883f (verified) · 1 parent: 9273746

Update README.md

Files changed (1): README.md (+96 -6)
README.md CHANGED
@@ -5,11 +5,101 @@ tags:
  - pytorch_model_hub_mixin
  ---
 
- # GPT2 Nepali 124M model.
-
- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Library: https://huggingface.co/Aananda-giri/GPT2-Nepali/
- - Docs: [More Information Needed]
-
- * [Code (github)](https://github.com/Aananda-giri/GPT2-Nepali)
- * [chat interface (huggingface-space)](https://huggingface.co/spaces/Aananda-giri/gpt2-nepali)
+ # GPT2-Nepali 124M Base Model
+
+ Welcome to the **GPT2-Nepali** repository! This project features a GPT-2 model trained from scratch on a 12 GB Nepali text dataset derived from the [NepBERTa project](https://nepberta.github.io). The model is tailored specifically to the Nepali language and includes a user-friendly chat interface hosted on Hugging Face Spaces.
+
+ ---
+
+ ## Project Highlights
+
+ - **Chat Interface:** [Hugging Face Space](https://huggingface.co/spaces/Aananda-giri/gpt2-nepali)
+ - **Training Code:** [GitHub Repository](https://github.com/Aananda-giri/GPT2-Nepali)
+ - **Dataset:** 12 GB of Nepali text extracted from the [NepBERTa project](https://nepberta.github.io)
+
+ ---
+
+ ## Overview
+
+ **GPT2-Nepali** adapts the GPT-2 training process (inspired by [Build a Large Language Model (From Scratch)](https://www.manning.com/books/build-a-large-language-model-from-scratch)) to the nuances of the Nepali language. Key modifications include a dedicated BPE tokenizer for Nepali and dataloader adjustments to better handle pre-tokenized datasets.
+
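+ As a quick illustration of the dedicated tokenizer, the sketch below loads it from the Hub and round-trips a Nepali phrase. This is a minimal example; it assumes only the `PreTrainedTokenizerFast` checkpoint published in this repository (the same one used in the Quick Start below):
+
+ ```python
+ from transformers import PreTrainedTokenizerFast
+
+ # Load the Nepali BPE tokenizer published with this model
+ tokenizer = PreTrainedTokenizerFast.from_pretrained("Aananda-giri/GPT2-Nepali")
+
+ # Round-trip a Nepali phrase through the tokenizer
+ text = "रामले भात"
+ ids = tokenizer.encode(text)
+ print(ids)                   # token IDs from the Nepali BPE vocabulary
+ print(tokenizer.decode(ids)) # should reproduce the original phrase
+ ```
+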
+ ---
+
+ ## Installation
+
+ * Clone the repository and install the required dependencies:
+
+   ```bash
+   git clone https://github.com/Aananda-giri/GPT2-Nepali.git
+   cd GPT2-Nepali
+   pip install -r requirements.txt
+   ```
+
+ * Download `gpt_model_code.py`, which defines the model and the generation utilities used below:
+
+   ```python
+   import requests
+
+   # Fetch the standalone model/inference code from the GitHub repository
+   url = "https://raw.githubusercontent.com/Aananda-giri/GPT2-Nepali/main/3.%20GPT2-Nepali/2_inference/gpt_model_code.py"
+   res = requests.get(url)
+   res.raise_for_status()
+   with open("gpt_model_code.py", "w") as f:
+       f.write(res.text)
+   ```
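+
+   As a quick, optional sanity check that the download worked, the import below should succeed; `GPTModel` and `generate` are the names the Quick Start imports:
+
+   ```python
+   # The import succeeds only if gpt_model_code.py is in the working directory
+   from gpt_model_code import GPTModel, generate
+   ```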
+
+ ---
+
+ ## Quick Start
+
+ Below is a sample script to load the pre-trained model and generate text:
+
+ ```python
+ import torch
+ from transformers import PreTrainedTokenizerFast
+
+ from gpt_model_code import GPTModel, generate
+
+ # Pick the device
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ # Load the pre-trained model from the Hugging Face Hub
+ # (from_pretrained builds the model, so no separate GPTModel(GPT_CONFIG_124M) init is needed)
+ model = GPTModel.from_pretrained("Aananda-giri/GPT2-Nepali")
+ model.to(device)
+
+ # Load the tokenizer
+ tokenizer = PreTrainedTokenizerFast.from_pretrained("Aananda-giri/GPT2-Nepali")
+
+ # Generate sample text
+ prompt = "रामले भात"
+ generated_text = generate(
+     model,
+     prompt,
+     tokenizer,
+     max_new_tokens=100,
+     temperature=0.7,
+     top_k=50,
+     top_p=None,              # set to a float (e.g. 0.9) to enable nucleus sampling
+     eos_id=None,
+     repetition_penalty=1.2,
+     penalize_len_below=50,
+ )
+
+ print(generated_text)
+ ```
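+
+ For readers curious what the `temperature` and `top_k` knobs above control, here is a minimal, self-contained sketch of a single top-k/temperature sampling step. It illustrates the general technique only; it is not the exact implementation inside `gpt_model_code.py`:
+
+ ```python
+ import torch
+
+ def sample_next_token(logits, temperature=0.7, top_k=50):
+     """One sampling step: keep the top-k logits, rescale by temperature, sample."""
+     if top_k is not None:
+         top_logits, _ = torch.topk(logits, top_k)
+         # Mask out everything below the k-th largest logit
+         logits = torch.where(logits < top_logits[-1], torch.tensor(float("-inf")), logits)
+     if temperature > 0:
+         probs = torch.softmax(logits / temperature, dim=-1)
+         return torch.multinomial(probs, num_samples=1)  # sample from the distribution
+     return torch.argmax(logits, dim=-1)                 # greedy when temperature == 0
+
+ # Demo with random logits standing in for a model's next-token output
+ logits = torch.randn(50_000)  # vocabulary-sized logit vector
+ print(sample_next_token(logits))
+ ```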
+
+ ---
+
+ ## Acknowledgments
+
+ A special thank you to [@rasbt](https://twitter.com/rasbt) for the inspiration and for authoring *Build a Large Language Model (From Scratch)*, one of the best resources on LLMs available!
+
+ ---
+
+ Happy coding!