EgoX: Egocentric Video Generation from a Single Exocentric Video

This repository provides the model weights of EgoX, a video-to-video generation model that synthesizes egocentric (first-person) videos from a single exocentric (third-person) video.
EgoX is built on top of a large-scale video diffusion backbone and performs exo-to-ego viewpoint transformation without requiring multi-view inputs.

For detailed results, implementation details, and demo videos, please refer to our paper and project repository.


Usage

Please refer to the Quick Start section in the project repository for instructions on running inference and the required preprocessing steps.
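
As a minimal sketch, the released weights can be fetched from the Hugging Face Hub as shown below; the local directory path is illustrative, and the actual preprocessing and inference commands are those documented in the project repository's Quick Start.

# Minimal sketch: download the released EgoX weights from the Hugging Face Hub.
# The target directory below is a hypothetical choice; follow the project
# repository's Quick Start for preprocessing and inference.
from huggingface_hub import snapshot_download

weights_dir = snapshot_download(
    repo_id="DAVIAN-Robotics/EgoX",
    local_dir="./checkpoints/EgoX",  # illustrative local path
)
print(f"EgoX weights downloaded to: {weights_dir}")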


Citation

If you find this model or code useful in your research, please cite our paper:

@misc{kang2025egoxegocentricvideogeneration,
  title={EgoX: Egocentric Video Generation from a Single Exocentric Video},
  author={Taewoong Kang and Kinam Kim and Dohyeon Kim and Minho Park and Junha Hyung and Jaegul Choo},
  year={2025},
  eprint={2512.08269},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.08269},
}

Acknowledgement

This work builds upon the valuable open-source efforts of
4DNeX and
EgoExo4D.

We sincerely appreciate their contributions to the computer vision and robotics communities.
