Update README.md
README.md CHANGED

@@ -16,6 +16,7 @@ tags:
 ## Model Description
 
 `abs-bvv-2` is a 1.5 billion parameter decoder-only Transformer model. It is the second model in the **Progressive Growth Transformers (PGT)** series, designed to explore how linguistic and reasoning capabilities emerge as a function of model depth.
 
+This model is presented in the paper [Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129).
 
 This model was not trained monolithically. Instead, it was "grown" constructively, one layer at a time, upon a foundation of **frozen, non-semantic visual embeddings**, as introduced in the paper "[Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations](https://arxiv.org/abs/2507.04886)".
 
@@ -26,7 +27,7 @@ The core idea is to demonstrate an alternative, more modular and resource-effici
 
 `abs-bvv-2` represents the state of the model after 2 layers of progressive training. It has 2 Transformer blocks, a hidden dimension of 4096, and uses the `bvv241` tokenizer family.
 
-**Code:** [https://github.com/
+**Code:** [https://github.com/AVBochkov/PGT](https://github.com/AVBochkov/PGT)
 
 ## Intended Use
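Since the card's central claim is the constructive, layer-wise growth recipe, a minimal PyTorch sketch of that loop may help make it concrete. This is an illustration under stated assumptions, not the authors' training code: the class, the vocabulary size, the use of `nn.TransformerEncoderLayer` as the decoder block, and the always-trainable output head are all stand-ins; the actual implementation is in the PGT repository linked in the diff above.

```python
import torch
import torch.nn as nn

# Illustrative constants: d_model matches the card (4096); the vocabulary
# size of the bvv241 tokenizer is an assumption, not taken from the release.
VOCAB, D_MODEL, N_HEADS = 65536, 4096, 32

class GrowingLM(nn.Module):
    """Decoder-only LM grown one block at a time on a frozen embedding substrate."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.embed.weight.requires_grad = False  # frozen, non-semantic embeddings
        self.blocks = nn.ModuleList()
        # Assumption: the output head stays trainable at every growth stage.
        self.lm_head = nn.Linear(D_MODEL, VOCAB, bias=False)

    def grow(self) -> None:
        """Freeze every block trained so far, then append one new trainable block."""
        for p in self.blocks.parameters():
            p.requires_grad = False
        self.blocks.append(
            nn.TransformerEncoderLayer(D_MODEL, N_HEADS, batch_first=True)
        )

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        h = self.embed(ids)
        # Additive -inf mask above the diagonal makes attention causal.
        causal = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        for block in self.blocks:
            h = block(h, src_mask=causal)
        return self.lm_head(h)

model = GrowingLM()
model.grow()  # stage 1: only block 1 is trainable (abs-bvv-1 analogue)
model.grow()  # stage 2: block 1 is frozen, only block 2 trains (abs-bvv-2 analogue)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters this stage: {trainable}")
```

The point of the pattern is that each growth stage optimizes only the newest block, so the compute spent on earlier stages is reused rather than repeated; that is the modular, resource-efficient alternative to monolithic training that the card describes.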