Matrix-Game: Interactive World Foundation Model
Overview
Matrix-Game is a 17B-parameter interactive world foundation model for controllable game world generation.
Key Features
- Feature 1: Interactive Generation. A diffusion-based image-to-world model that generates high-quality videos conditioned on keyboard and mouse inputs, enabling fine-grained control and dynamic scene evolution.
- Feature 2: GameWorld Score. A comprehensive benchmark for evaluating Minecraft world models across four key dimensions: visual quality, temporal quality, action controllability, and physical rule understanding.
- Feature 3: Matrix-Game Dataset. A large-scale Minecraft dataset with fine-grained action annotations, supporting scalable training for interactive and physically grounded world modeling.
Latest Updates
- [2025-05] Initial release of Matrix-Game Model
Performance Comparison
GameWorld Score Benchmark Comparison
| Model | Image Quality ↑ | Aesthetic Quality ↑ | Temporal Cons. ↑ | Motion Smooth. ↑ | Keyboard Acc. ↑ | Mouse Acc. ↑ | 3D Cons. ↑ |
|---|---|---|---|---|---|---|---|
| Oasis | 0.65 | 0.48 | 0.94 | 0.98 | 0.77 | 0.56 | 0.56 |
| MineWorld | 0.69 | 0.47 | 0.95 | 0.98 | 0.86 | 0.64 | 0.51 |
| Ours | 0.72 | 0.49 | 0.97 | 0.98 | 0.95 | 0.95 | 0.76 |
Metric Descriptions:
- Image Quality / Aesthetic Quality: Visual fidelity and perceptual appeal of generated frames
- Temporal Consistency / Motion Smoothness: Temporal coherence and smoothness between frames
- Keyboard Accuracy / Mouse Accuracy: Accuracy in following user control signals
- 3D Consistency: Geometric stability and physical plausibility over time
Please see our GameWorld benchmark for implementation details.
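As a rough illustration of what an accuracy-style control metric measures, the sketch below compares the action a user requested on each frame with the action recovered from the generated video (for example, by an inverse dynamics model). The function name `control_accuracy`, the action labels, and the per-frame matching rule are assumptions made for this example; the GameWorld benchmark's actual implementation may differ.

```python
def control_accuracy(intended, recovered):
    """Fraction of frames whose recovered action matches the intended one.

    Hypothetical helper for illustration only; not the benchmark's code.
    """
    if len(intended) != len(recovered):
        raise ValueError("action sequences must have the same length")
    if not intended:
        return 0.0
    matches = sum(a == b for a, b in zip(intended, recovered))
    return matches / len(intended)


# Made-up per-frame keyboard actions, purely illustrative.
intended = ["forward", "forward", "left", "jump"]
recovered = ["forward", "forward", "right", "jump"]
acc = control_accuracy(intended, recovered)
print(f"keyboard accuracy: {acc:.2f}")
```

Under this reading, a score of 0.95 would mean that the recovered action agreed with the requested one on 95% of the evaluated frames.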
Human Evaluation
We conducted a double-blind human evaluation with two independent groups across four key dimensions: Overall Quality, Controllability, Visual Quality, and Temporal Consistency.
Scores represent the percentage of pairwise comparisons in which each method was preferred. Matrix-Game consistently outperforms prior models across all metrics and both groups.
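The preference percentage described above can be sketched as a simple tally over pairwise votes. The helper `preference_rates` and the placeholder method names are hypothetical, and the votes are invented for illustration; they are not the study's data.

```python
from collections import Counter


def preference_rates(winners):
    """Percentage of pairwise comparisons won by each method.

    `winners` holds one entry per comparison: the name of the preferred method.
    """
    total = len(winners)
    if total == 0:
        return {}
    counts = Counter(winners)
    return {method: 100.0 * n / total for method, n in counts.items()}


# Invented votes from four comparisons between two placeholder methods.
votes = ["model_a", "model_a", "model_b", "model_a"]
rates = preference_rates(votes)
print(rates)
```

With these placeholder votes, `model_a` is preferred in three of four comparisons, so its score is 75%.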
Quick Start
# clone the repository:
git clone https://github.com/SkyworkAI/Matrix-Game.git
cd Matrix-Game
# install dependencies:
pip install -r requirements.txt
# install apex and FlashAttention-3 separately (not covered by requirements.txt):
#   apex: https://github.com/NVIDIA/apex
#   FlashAttention-3: https://github.com/Dao-AILab/flash-attention
# inference
bash run_inference.sh
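Because apex and FlashAttention-3 are installed separately, it can help to confirm that they are importable before running inference. The following is a minimal sketch; the import names in `REQUIRED` are assumptions (in particular, the module name exposed by a FlashAttention-3 build may differ from `flash_attn`), so adjust them to match your environment.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; verify them against your own installation.
REQUIRED = ["torch", "diffusers", "apex", "flash_attn"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```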
Acknowledgements
We would like to express our gratitude to:
- Diffusers for their excellent diffusion model framework
- HunyuanVideo for their strong base model
- MineDojo for their Minecraft video dataset
- MineRL for their excellent gym framework
- Video-Pre-Training for their accurate Inverse Dynamics Model
- GameFactory for their idea of action control module
We are grateful to the broader research community for their open exploration and contributions to the field of interactive world generation.
Citation
If you find this project useful, please cite our paper:
@article{zhang2025matrixgame,
title = {Matrix-Game: Interactive World Foundation Model},
author = {Yifan Zhang and Chunli Peng and Boyang Wang and Puyi Wang and Qingcheng Zhu and Zedong Gao and Eric Li and Yang Liu and Yahui Zhou},
journal = {arXiv},
year = {2025}
}