Update README.md
README.md CHANGED

```diff
@@ -23,8 +23,8 @@ Theta-35 is the advanced reasoning model in the Theta series by SVECTOR. Compare
 **This repo contains the Theta-35 model**, which has the following features:
 - Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
 - Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
-- Number of Parameters:
-- Number of Parameters (Non-Embedding):
+- Number of Parameters: 33B
+- Number of Parameters (Non-Embedding): 33B
 - Number of Layers: 64
 - Number of Attention Heads (GQA): 40 for Q and 8 for KV
 - Context Length: Full 131,072 tokens
```
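For context, here is a minimal sketch of how the architecture figures listed in the diff (64 layers, 40 query heads, 8 key/value heads via GQA, 131,072-token context, 33B parameters) would typically surface when inspecting the model with Hugging Face `transformers`. The repository id `SVECTOR-CORPORATION/Theta-35` and the exact config attribute names are assumptions following common `transformers` conventions, not something confirmed by this diff.

```python
# Sketch only: the repo id and config field names below are assumptions
# based on common transformers conventions, not on this diff.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "SVECTOR-CORPORATION/Theta-35"  # assumed Hugging Face repo id

# Inspect the config first; the values from the README should appear here.
config = AutoConfig.from_pretrained(model_id)
print(config.num_hidden_layers)        # expected: 64
print(config.num_attention_heads)      # expected: 40 (query heads)
print(config.num_key_value_heads)      # expected: 8  (GQA key/value heads)
print(config.max_position_embeddings)  # expected: 131072

# Load the tokenizer and weights (33B parameters, so this needs substantial memory).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
```

One note on the GQA setting: keeping 8 key/value heads against 40 query heads shrinks the KV cache roughly fivefold compared with full multi-head attention, which matters at the full 131,072-token context length.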