Pro models [pretrain]
Frozen-embedding language models for English, Russian, and Chinese; demonstration and comparison with a standard LM.
Text Generation • Note: This is a conceptual English language model (200M parameters) trained from scratch with **frozen**, non-semantic token embeddings, demonstrating that transformer blocks can learn semantics even when the embedding layer contains no prior meaning.
Bochkov/pro_bvv_unfrozen
Text Generation • Note: This is a baseline English language model (200M parameters) trained in the **classical** way with fully trainable token embeddings, provided for direct comparison with the frozen-embedding variant above.
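Conceptually, the only difference between the two variants above is whether the embedding table receives gradients. A minimal PyTorch sketch of that distinction; the sizes, initialization, and names here are illustrative assumptions, not the actual pro_bvv configuration (the real frozen variant uses the visual Unicode representations described in the papers below):

```python
import torch.nn as nn

# Illustrative dimensions only (assumption, not the pro_bvv setup).
vocab_size, d_model = 1024, 512

# Classical baseline: the embedding table is a trainable parameter.
trainable_emb = nn.Embedding(vocab_size, d_model)

# Frozen variant: a fixed, non-semantic table that never receives gradients;
# only the transformer blocks above it are optimized.
frozen_emb = nn.Embedding(vocab_size, d_model)
frozen_emb.weight.requires_grad_(False)  # excluded from the optimizer

# Nothing in the frozen table is left to train.
assert all(not p.requires_grad for p in frozen_emb.parameters())
```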
Bochkov/pro_bvv_ru
Text Generation • Note: Experimental 200M-parameter language model jointly trained on an English-Russian (EN-RU) corpus with frozen, visually-motivated token embeddings. Designed to demonstrate cross-lingual learning without updating the embeddings.
Bochkov/pro_bvv_zh
Text Generation • Note: 200M-parameter English-Chinese language model trained on a mixed EN/ZH corpus using frozen, visually-motivated token embeddings (Unicode-based). Intended to demonstrate cross-lingual learning and generalization.
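The checkpoints above are hosted on the Hugging Face Hub. A minimal generation sketch, assuming the repositories can be loaded through the standard transformers causal-LM interface; if a repo ships custom modeling or tokenizer code, `trust_remote_code=True` is needed, so check each model card before relying on this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the repo exposes a standard tokenizer and causal-LM head.
repo_id = "Bochkov/pro_bvv_unfrozen"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("The meaning of a token is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```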
Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations
Paper • 2507.04886 • Published
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate
Paper • 2507.07129 • Published