DarwinAnim8or committed
Commit cb265aa · 1 Parent(s): 2eec48f

Update README.md

Files changed (1)
1. README.md +11 -1
README.md CHANGED
@@ -5,11 +5,21 @@ license: other
 
 This is a text generation model based on the [OPT-1.3B](https://huggingface.co/facebook/opt-1.3b) model from Meta, trained using the Deepspeed library. The model can generate natural and engaging conversational responses given a user input.
 
-## Model Details
+## Training Details
 
 - The base model is [OPT-1.3B](https://huggingface.co/facebook/opt-1.3b), a decoder-only transformer with 1.3 billion parameters, pre-trained on a large text corpus using the causal language modeling objective.
 - The model was trained on a single NVIDIA A100 GPU using the Deepspeed pipeline parallelism and ZeRO optimizer.
 
+## Model Details
+- Number of parameters: 1.3 billion
+- Number of layers: 24
+- Number of attention heads: 16
+- Context size: 2048
+- Vocabulary size: 50,265
+- Embedding size: 1280
+- Feed-forward size: 5120
+- Dropout rate: 0.1
+
 ## Usage
 
 You can use this model directly with the Hugging Face pipeline for text generation:
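The hunk ends at the sentence above, so the code snippet the README introduces is not included in this diff. As a rough illustration of what such a Hugging Face pipeline call could look like (the repository ID below is a placeholder, not the model's real repo name):

```python
# Minimal sketch, not taken from the commit: load the model with the
# Hugging Face text-generation pipeline. The repo ID is a placeholder.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="DarwinAnim8or/opt-1.3b-chat",  # hypothetical repo ID; use the actual one
)

prompt = "User: Tell me a fun fact about octopuses.\nBot:"
result = generator(prompt, max_new_tokens=64, do_sample=True, top_p=0.9)
print(result[0]["generated_text"])
```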
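The Training Details section mentions DeepSpeed pipeline parallelism and the ZeRO optimizer, but the configuration itself is not shown in this commit. A minimal sketch of a ZeRO-style DeepSpeed config, with every value an assumption rather than the settings actually used:

```python
# Illustrative DeepSpeed configuration; all values are assumptions and are not
# taken from this repository.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,                              # assumed ZeRO stage
        "offload_optimizer": {"device": "cpu"},  # offload optimizer state to CPU
    },
}

# A dict like this is typically written to a JSON file for the `deepspeed`
# launcher, or passed to deepspeed.initialize(config=...) in the training script.
```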