Papers
arxiv:2506.18851

Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset

Published on Jun 23
· Submitted by ZhuoweiChen on Jun 24
Authors:
,
,
,
,
,

Abstract

A cross-pair dataset called Phantom-Data improves subject-to-video generation by enhancing prompt alignment and visual quality while maintaining identity consistency.

AI-generated summary

Subject-to-video generation has witnessed substantial progress in recent years. However, existing models still face significant challenges in faithfully following textual instructions. This limitation, commonly known as the copy-paste problem, arises from the widely used in-pair training paradigm. This approach inherently entangles subject identity with background and contextual attributes by sampling reference images from the same scene as the target video. To address this issue, we introduce Phantom-Data, the first general-purpose cross-pair subject-to-video consistency dataset, containing approximately one million identity-consistent pairs across diverse categories. Our dataset is constructed via a three-stage pipeline: (1) a general and input-aligned subject detection module, (2) large-scale cross-context subject retrieval from more than 53 million videos and 3 billion images, and (3) prior-guided identity verification to ensure visual consistency under contextual variation. Comprehensive experiments show that training with Phantom-Data significantly improves prompt alignment and visual quality while preserving identity consistency on par with in-pair baselines.

Community

Paper author Paper submitter
Paper author Paper submitter

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2506.18851 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2506.18851 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2506.18851 in a Space README.md to link it from this page.

Collections including this paper 4