Model Card

These are the model checkpoints used in the paper "VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models".

We currently release the Qwen2.5 VLM checkpoints along with the networks needed for training. The remaining checkpoints will be released once the paper is accepted.

Source

Project Page: https://nus-lins-lab.github.io/vlaos/
Paper: https://arxiv.org/abs/2506.17561
Code: https://github.com/HeegerGao/VLA-OS
Data: https://huggingface.co/datasets/Linslab/VLA-OS-Dataset

Usage

Make sure Git LFS is installed. The commands below are for Debian/Ubuntu systems:

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install

Then clone this repository:

git clone https://huggingface.co/Linslab/VLA-OS
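
If Git LFS was not active during the clone, the checkpoint files may be left as small text pointer stubs instead of the real weights. The following is a minimal sketch for spotting that case, assuming the repository was cloned into a folder named VLA-OS in the current directory. The helper names is_lfs_pointer and find_lfs_pointers are hypothetical and not part of the VLA-OS codebase. The check relies only on the standard Git LFS pointer header.

```shell
# Assumption: the clone lives in ./VLA-OS unless another directory is passed.

# Succeeds (exit status 0) if the file begins with the Git LFS pointer header.
is_lfs_pointer() {
  head -c 42 "$1" 2>/dev/null | grep -qa '^version https://git-lfs\.github\.com/spec/v1'
}

# Prints every file under the directory that is still a pointer stub,
# skipping Git's own metadata in .git.
find_lfs_pointers() {
  find "${1:-VLA-OS}" -type f ! -path '*/.git/*' | while IFS= read -r f; do
    if is_lfs_pointer "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

After defining the functions, run find_lfs_pointers VLA-OS. If it prints any paths, running git lfs pull inside the VLA-OS directory should fetch the actual files.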

Model Description

Please refer to the codebase (linked under Source) for a detailed model description and usage instructions.

Citation

If you find our work helpful, please cite us:

@article{gao2025vlaos,
  title   = {VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models},
  author  = {Gao, Chongkai and Liu, Zixuan and Chi, Zhenghao and Huang, Junshan and Fei, Xin and Hou, Yiwen and Zhang, Yuxuan and Lin, Yudi and Fang, Zhirui and Jiang, Zeyu and Shao, Lin},
  journal = {arXiv preprint arXiv:2506.17561},
  year    = {2025},
  url     = {https://arxiv.org/abs/2506.17561}
}

Thank you!