Pro models [pretrain]
Frozen-embedding language models for English, Russian, and Chinese; demonstration and comparison with a standard LM.
Text Generation • Note: This is a conceptual English language model (200M parameters) trained from scratch with **frozen**, non-semantic token embeddings, demonstrating that transformer blocks can learn semantics even when the embedding layer contains no prior meaning.
Bochkov/pro_bvv_unfrozen
Text Generation • Note: This is a baseline English language model (200M parameters) trained in the **classical** way with fully trainable token embeddings, provided for direct comparison with the frozen-embedding variant above.
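Conceptually, the only difference between the two variants above is whether the embedding table receives gradients. A minimal PyTorch sketch of that distinction; the sizes, initialization, and names here are illustrative assumptions, not the actual pro_bvv configuration (the real frozen variant uses the visual Unicode representations described in the papers below):

```python
import torch.nn as nn

# Illustrative dimensions only (assumption, not the pro_bvv setup).
vocab_size, d_model = 1024, 512

# Classical baseline: the embedding table is a trainable parameter.
trainable_emb = nn.Embedding(vocab_size, d_model)

# Frozen variant: a fixed, non-semantic table that never receives gradients;
# only the transformer blocks above it are optimized.
frozen_emb = nn.Embedding(vocab_size, d_model)
frozen_emb.weight.requires_grad_(False)  # excluded from the optimizer

# Nothing in the frozen table is left to train.
assert all(not p.requires_grad for p in frozen_emb.parameters())
```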
Bochkov/pro_bvv_ru
Text Generation • Note: Experimental 200M-parameter language model jointly trained on an English-Russian (EN-RU) corpus with frozen, visually-motivated token embeddings. Designed to demonstrate cross-lingual learning without updating the embeddings.
Bochkov/pro_bvv_zh
Text Generation • Note: 200M-parameter English-Chinese language model trained on a mixed EN/ZH corpus using frozen, visually-motivated token embeddings (Unicode-based). Intended to demonstrate cross-lingual learning and generalization.
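The checkpoints above are hosted on the Hugging Face Hub. A minimal generation sketch, assuming the repositories can be loaded through the standard transformers causal-LM interface; if a repo ships custom modeling or tokenizer code, `trust_remote_code=True` is needed, so check each model card before relying on this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the repo exposes a standard tokenizer and causal-LM head.
repo_id = "Bochkov/pro_bvv_unfrozen"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("The meaning of a token is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```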
Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations
Paper • 2507.04886 • Published
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate
Paper • 2507.07129 • Published