metadata
tags:
  - Image-to-3D
  - Image-to-4D
  - GenXD
language:
  - en
  - zh
base_model:
  - stabilityai/stable-video-diffusion-img2vid-xt
pipeline_tag: image-to-3d
license: apache-2.0
datasets:
  - Yuyang-z/CamVid-30K

GenXD Model Card


Model Details


Model Description

GenXD uses a mask latent conditioned diffusion model to generate 3D and 4D samples from both camera and image conditions. In addition, multiview-temporal modules combined with alpha-fusing are proposed to disentangle and fuse multiview and temporal information effectively.

  • Developed by: NUS, Microsoft
  • Model type: image-to-3D diffusion model, image-to-video diffusion model, image-to-4D diffusion model
  • License: Apache-2.0
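
To make the alpha-fusing idea in the description more concrete, here is a minimal sketch in plain Python. It assumes that fusing means a weighted blend of a multiview feature and a temporal feature controlled by a scalar `alpha`; this card does not state the exact formula, so the function `alpha_fuse`, its arguments, and the blending rule are illustrative assumptions rather than the GenXD implementation.

```python
from typing import List


def alpha_fuse(multiview: List[float], temporal: List[float], alpha: float) -> List[float]:
    """Blend two feature vectors as (1 - alpha) * multiview + alpha * temporal.

    Hypothetical helper: the blending rule is an assumption for illustration,
    not the formula used by GenXD.
    """
    if len(multiview) != len(temporal):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * m + alpha * t for m, t in zip(multiview, temporal)]


# Toy features standing in for the outputs of the two modules.
multiview_feat = [1.0, 2.0, 3.0]
temporal_feat = [3.0, 2.0, 1.0]

# alpha = 0 keeps only the multiview branch (a static, 3D-like case).
static_out = alpha_fuse(multiview_feat, temporal_feat, alpha=0.0)

# alpha = 0.5 mixes both branches equally (a dynamic, 4D-like case).
dynamic_out = alpha_fuse(multiview_feat, temporal_feat, alpha=0.5)
```

Under this assumption, a single weight lets one network handle both static and dynamic data: setting `alpha` to zero removes the temporal contribution, while larger values let temporal information in.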

Model Sources

Uses

Direct Use

The model is intended for research purposes only. Possible research areas and tasks include:

  • Generation of artworks and use in design and other artistic processes.

  • Applications in educational or creative tools.

  • Research on generative models.

  • Safe deployment of models which have the potential to generate harmful content.

  • Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.

Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.

Limitations and Bias

Limitations

  • The model does not achieve perfect photorealism.
  • The model does not achieve perfect 3D consistency.

Bias

While the capabilities of generative models are impressive, they can also reinforce or exacerbate social biases.