SVECTOR-OFFICIAL committed on
Commit ca136bc · verified · 1 Parent(s): cd7e481

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -23,8 +23,8 @@ Theta-35 is the advanced reasoning model in the Theta series by SVECTOR. Compare
  **This repo contains the Theta-35 model**, which has the following features:
  - Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
  - Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- - Number of Parameters: 35B
- - Number of Parameters (Non-Embedding): 33.5B
+ - Number of Parameters: 33B
+ - Number of Parameters (Non-Embedding): 33B
  - Number of Layers: 64
  - Number of Attention Heads (GQA): 40 for Q and 8 for KV
  - Context Length: Full 131,072 tokens
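
Reviewer note: the GQA line in the feature list (40 query heads, 8 key/value heads) implies that each KV head is shared by 5 query heads. The sketch below only illustrates that arithmetic. The function name `gqa_head_mapping` is hypothetical, and the contiguous grouping (query heads 0-4 share KV head 0, and so on) is an assumed convention, not something this README confirms about the Theta-35 implementation.

```python
def gqa_head_mapping(num_q_heads: int, num_kv_heads: int) -> list[int]:
    """Return, for each query head index, the KV head index it reads from.

    Assumes contiguous grouping: consecutive query heads share one KV head.
    This is an illustrative convention, not taken from the Theta-35 code.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly across KV heads")
    group_size = num_q_heads // num_kv_heads  # query heads per KV head
    return [q // group_size for q in range(num_q_heads)]


# Head counts taken from the README feature list above.
mapping = gqa_head_mapping(num_q_heads=40, num_kv_heads=8)
group_size = 40 // 8
```

With these head counts the KV cache stores 8 heads per layer instead of 40, which is the usual motivation for GQA at long context lengths.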