
Model Card for CrystaLLM-pi_bandgap

Model Details

Model Description

CrystaLLM-pi_bandgap is a conditional generative model designed for the inverse design of inorganic crystalline materials. It is a fine-tuned version of the CrystaLLM-pi framework, based on a GPT-2 decoder-only architecture. This specific variant employs the Property-Key-Value (PKV) attention mechanism (referred to as "Prefix attention" in the associated preprint) to condition the generation of Crystallographic Information Files (CIFs) on specific electronic and thermodynamic properties.

The model generates crystal structures (cell parameters and atomic positions) based on two target scalar properties:

  1. Band gap (eV)
  2. Energy above convex hull ($E_{hull}$, eV/atom), a proxy for thermodynamic stability

  • Developed by: Bone et al. (University College London)
  • Model type: Autoregressive Transformer with Prefix Attention Conditioning
  • Language(s): CIF (Crystallographic Information File) syntax
  • License: MIT
  • Finetuned from model: c-bone/CrystaLLM-pi_base
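
The PKV conditioning described above can be sketched as follows. This is a minimal, illustrative implementation assuming GPT-2 Small geometry (12 layers, 12 heads, 768-dim hidden state); the class name, prefix length, and projection MLP are assumptions, not the authors' actual code.

```python
import torch
import torch.nn as nn

class PropertyPrefixEncoder(nn.Module):
    """Hypothetical sketch: project continuous property targets into
    per-layer key/value prefixes that the attention layers can attend to."""

    def __init__(self, n_props=2, n_layer=12, n_head=12, d_model=768, prefix_len=1):
        super().__init__()
        self.n_layer, self.n_head = n_layer, n_head
        self.d_head = d_model // n_head
        self.prefix_len = prefix_len
        # maps (band gap, E_hull) to one key and one value vector per layer
        self.proj = nn.Sequential(
            nn.Linear(n_props, d_model),
            nn.Tanh(),
            nn.Linear(d_model, n_layer * 2 * prefix_len * d_model),
        )

    def forward(self, props):
        # props: (batch, n_props), already normalized
        b = props.shape[0]
        out = self.proj(props)
        out = out.view(b, self.n_layer, 2, self.prefix_len, self.n_head, self.d_head)
        # -> (n_layer, kv, batch, n_head, prefix_len, d_head)
        out = out.permute(1, 2, 0, 4, 3, 5)
        # return in the per-layer (key, value) tuple format GPT-2-style
        # attention consumes as past key/values
        return [(out[i, 0], out[i, 1]) for i in range(self.n_layer)]

enc = PropertyPrefixEncoder()
pkv = enc(torch.tensor([[0.45, 0.02]]))  # (band gap, E_hull), normalized targets
```

Because the prefixes enter every layer's key/value space, the target properties remain visible to the model at every decoding step, rather than only at the start of the sequence.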

Uses

Direct Use

The model is intended for research in materials science, specifically for the exploration of chemical space targeting specific electronic properties. Users can input a desired band gap and a stability criterion to generate candidate crystal structures.

Out-of-Scope Use

  • Organic Materials: The model was trained exclusively on inorganic crystal structures.
  • Large Unit Cells: Due to the context window limit of 1024 tokens, the model cannot reliably generate unit cells containing more than approximately 20 atoms.
  • Disordered Systems: The model currently generates ordered structures and does not natively handle partial occupancies.
  • Production Deployment: This is a research artifact. Generated structures must be validated via Density Functional Theory (DFT) or other simulation methods before synthesis attempts.

Bias, Risks, and Limitations

  • Training Distribution Bias: The model is trained on the Materials Project database. It exhibits higher performance in regions of chemical space well-represented in the training data (e.g., band gaps near 0 eV). Performance degrades in sparse regions of the property manifold.
  • Validity: As an autoregressive language model, it may generate syntactically incorrect CIFs or chemically implausible structures. Post-processing validation is required.
  • Hallucination: The model may generate "novel" compositions that are thermodynamically unstable.

How to Get Started with the Model

For instructions on how to load and run generation with this model, refer to the _load_and_generate.py script in the CrystaLLM-pi GitHub repository. This script handles the tokenization, property normalization, and prompt construction required to condition the model properly.

Training Details

Training Data

The model was fine-tuned on the MP Bandgap dataset, a subset of the Materials Project containing approximately 53.3K inorganic structures labeled with PBE band gaps and $E_{hull}$ values.

  • Source: Materials Project (via c-bone/mpdb-2prop_clean)
  • Preprocessing: CIFs are augmented, tokenized, and property values are normalized before injection into the attention mechanism.
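
The property normalization step can be illustrated with a simple z-score transform. The actual scheme and dataset statistics are defined in the CrystaLLM-pi codebase; the means and standard deviations below are placeholders, not the dataset's real statistics.

```python
def zscore(x, mean, std):
    """Standardize a scalar property target before conditioning."""
    return (x - mean) / std

# illustrative statistics only; the real values come from the training set
gap_norm = zscore(2.0, mean=1.2, std=1.5)     # band gap target (eV)
hull_norm = zscore(0.05, mean=0.08, std=0.1)  # E_hull target (eV/atom)
```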

Training Procedure

  • Architecture: GPT-2 Small with additional Property-Key-Value (PKV) encoder layers. (~61.6M parameters)
  • Mechanism: Continuous property values are projected into the attention mechanism's key-value space (Prefix Tuning), allowing the model to attend to the target properties at every generation step.
  • Optimization: A dual optimization strategy was employed, using a lower learning rate for the pre-trained backbone and a higher learning rate for the condition encoder to prevent catastrophic forgetting.
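
The dual-learning-rate strategy can be sketched with PyTorch parameter groups. This assumes the condition-encoder parameters are identifiable by name; the module names and learning rates here are illustrative, not the authors' actual configuration.

```python
import torch
import torch.nn as nn

def build_optimizer(model, backbone_lr=1e-5, encoder_lr=1e-4):
    """Assign a lower LR to the pre-trained backbone and a higher LR
    to the newly added condition encoder (names are assumptions)."""
    backbone_params, encoder_params = [], []
    for name, p in model.named_parameters():
        (encoder_params if "prefix" in name else backbone_params).append(p)
    return torch.optim.AdamW([
        {"params": backbone_params, "lr": backbone_lr},  # pre-trained GPT-2 weights
        {"params": encoder_params, "lr": encoder_lr},    # PKV condition encoder
    ])

# tiny stand-in model to show the parameter split
demo = nn.ModuleDict({
    "backbone": nn.Linear(4, 4),
    "prefix_encoder": nn.Linear(2, 4),
})
opt = build_optimizer(demo)
```

Keeping the backbone learning rate small preserves the CIF syntax learned during pre-training, while the larger encoder rate lets the freshly initialized conditioning pathway train quickly.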

Evaluation

Metrics

The model is evaluated based on:

  1. Validity: Percentage of generated files that are valid CIFs.
  2. Hit-Rate: The fraction of generated structures where the predicted property (via surrogate model) falls within a tolerance of the target property.
  3. VSUN: A composite metric ensuring structures are Valid, Stable (low $E_{hull}$), Unique, and Novel.
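
The hit-rate metric can be computed as a simple tolerance check. The predicted values and tolerance below are illustrative; in the actual evaluation the predictions come from a surrogate property model.

```python
def hit_rate(predicted, target, tol):
    """Fraction of generated structures whose surrogate-predicted
    property lands within +/- tol of the conditioning target."""
    hits = [abs(p - target) <= tol for p in predicted]
    return sum(hits) / len(hits)

# e.g. four generated candidates conditioned on a 2.0 eV band gap
rate = hit_rate([1.9, 2.4, 0.3, 2.05], target=2.0, tol=0.2)
```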

Results

As detailed in Figure 3 of the associated preprint, the PKV (Prefix) architecture demonstrates strong capability in steering generation toward target band gaps, particularly when compared to sequence-level conditioning baselines.

Citation

@misc{bone2025discoveryrecoverycrystallinematerials,
      title={Discovery and recovery of crystalline materials with property-conditioned transformers}, 
      author={Cyprien Bone and Matthew Walker and Kuangdai Leng and Luis M. Antunes and Ricardo Grau-Crespo and Amil Aligayev and Javier Dominguez and Keith T. Butler},
      year={2025},
      eprint={2511.21299},
      archivePrefix={arXiv},
      primaryClass={cond-mat.mtrl-sci},
      url={https://arxiv.org/abs/2511.21299}, 
}