DeepSeekDDM
committed on
fix the calculation of act params
README_WEIGHTS.md CHANGED (+1 -1)
@@ -18,7 +18,7 @@ The DeepSeek-V3 weight file consists of two main components: **Main Model Weight
 - Input/output embedding layers and a complete set of 61 Transformer hidden layers.
 - **Parameter Count**:
   - Total parameters: **671B**
-  - Activation parameters: **36.
+  - Activation parameters: **36.6B**.
 
 #### Structural Details
 
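To put the corrected figure in context, here is a minimal sketch of the arithmetic relating the two counts in the diff. It assumes "activation parameters" means the parameters used per token by the mixture-of-experts model. The function name `activated_fraction` and the constants are hypothetical, chosen only for illustration, and do not come from the repository.

```python
# Figures quoted in the diff for the main model weights (in billions).
TOTAL_PARAMS_B = 671.0      # "Total parameters: 671B"
ACTIVATED_PARAMS_B = 36.6   # "Activation parameters: 36.6B" (value after this commit)


def activated_fraction(activated_b: float, total_b: float) -> float:
    """Return the share of parameters that are activated, as a value in [0, 1].

    Hypothetical helper: assumes both counts use the same unit (billions).
    """
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return activated_b / total_b


fraction = activated_fraction(ACTIVATED_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.2%} of the main model parameters are activated")
# prints "5.45% of the main model parameters are activated"
```

Under that assumption, roughly one parameter in eighteen is activated, which is why the activated count is reported separately from the 671B total.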