Taishi-N324 committed on
Commit 1ede338 · verified · 1 Parent(s): 647ed31

Update README.md

Files changed (1)
  1. README.md +12 -10
README.md CHANGED
@@ -72,16 +72,18 @@ print(tokenizer.decode(output))
 - **Model type:** Transformer-based Language Model
 - **Total seen tokens:** 2.1T tokens
 
-|Params|Layers|Hidden size|Heads|Context length|Embedding parameters|Non-embedding parameters|
-|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-|150M|12|512|8|4096|101,874,688|50,344,448|
-|440M|16|1024|8|4096|203,749,376|243,303,424|
-|980M|20|1536|8|4096|305,624,064|684,258,816|
-|1.8b|24|2048|16|4096|407,498,752|1,459,718,144|
-|3.7b|28|3072|24|4096|611,248,128|3,171,068,928|
-|7.2b|32|4096|32|4096|814,997,504|6,476,271,616|
-|13b|40|5120|40|4096|1,018,746,880|12,688,184,320|
-|172b|96|12288|96|4096|2,444,992,512|169,947,181,056|
+|Params|Layers|Hidden size|Heads|Routed Experts|Activated Experts|Context length|Embedding parameters|Non-embedding parameters|Activated parameters|Total parameters|
+|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+|150M|12|512|8|-|-|4096|101,874,688|50,344,448|152,219,136|152,219,136|
+|440M|16|1024|8|-|-|4096|203,749,376|243,303,424|447,052,800|447,052,800|
+|980M|20|1536|8|-|-|4096|305,624,064|684,258,816|989,882,880|989,882,880|
+|1.8b|24|2048|16|-|-|4096|407,498,752|1,459,718,144|1,867,216,896|1,867,216,896|
+|8x1.8b|24|2048|16|8|2|4096|407,498,752|8,858,863,616|2,924,279,808|9,266,362,368|
+|3.7b|28|3072|24|-|-|4096|611,248,128|3,171,068,928|3,782,317,056|3,782,317,056|
+|7.2b|32|4096|32|-|-|4096|814,997,504|6,476,271,616|7,291,269,120|7,291,269,120|
+|13b|40|5120|40|-|-|4096|1,018,746,880|12,688,184,320|13,706,931,200|13,706,931,200|
+|8x13b|40|5120|40|8|2|4096|1,018,746,880|72,144,081,920|22,200,806,400|73,162,828,800|
+|172b|96|12288|96|-|-|4096|2,444,992,512|169,947,181,056|172,392,173,568|172,392,173,568|
 
 ## Tokenizer