metadata
license: apache-2.0
language:
  - en
base_model:
  - LanguageBind/Open-Sora-Plan-v1.3.0

DATAGRID-Open-Sora-Plan-v1.3.0-0.16M

DATAGRID-Open-Sora-Plan-v1.3.0-0.16M is a text-to-video (T2V) diffusion model based on the Open-Sora-Plan architecture. DATAGRID Inc. fine-tuned it on a custom dataset of 0.16 million (160,000) royalty-free video clips to generate high-quality videos from text prompts.

Model Details

Model Description

This model extends Open-Sora-Plan v1.3.0 (base model: LanguageBind/Open-Sora-Plan-v1.3.0) by fine-tuning it on a curated, proprietary dataset.

Training Details

  • Training Data: Fine-tuned on a custom dataset of 0.16 million royalty-free video-text pairs. DATAGRID Inc. collected and curated this dataset independently, focusing on diverse scenes, motions, and objects. To prepare training data for video-to-video (V2V) inpainting, we built an automated mask generation pipeline that uses Meta AI's SAM2 (Segment Anything Model 2) and Microsoft's Florence2 to generate masks for target objects in videos. Compared with manual annotation, this made data preparation more efficient and less costly.
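
The mask generation step described above can be pictured with a minimal sketch. It is not the DATAGRID pipeline, whose details are not published here. It assumes a common division of labor in which a text-grounded detector (the role Florence2 could play) proposes boxes for a phrase and a segmenter (the role SAM2 could play) turns each box into a pixel mask. The names `generate_video_masks`, `detect`, `segment`, and the toy stand-ins are all hypothetical, and the per-frame loop is a simplification that ignores any cross-frame mask propagation.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[List[int]]          # toy grayscale frame: rows of pixel values
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
Mask = List[List[int]]           # same size as a frame, 1 = target object


def generate_video_masks(
    frames: Sequence[Frame],
    phrase: str,
    detect: Callable[[Frame, str], List[Box]],
    segment: Callable[[Frame, Box], Mask],
) -> List[Mask]:
    """Return one binary mask per frame covering every detected object."""
    masks = []
    for frame in frames:
        height, width = len(frame), len(frame[0])
        combined = [[0] * width for _ in range(height)]
        # Hypothetical detector role: boxes for objects matching the phrase.
        for box in detect(frame, phrase):
            # Hypothetical segmenter role: pixel mask for one box.
            obj_mask = segment(frame, box)
            for y in range(height):
                for x in range(width):
                    combined[y][x] |= obj_mask[y][x]  # union of object masks
        masks.append(combined)
    return masks


# Toy stand-ins so the sketch runs without any model weights.
def toy_detect(frame: Frame, phrase: str) -> List[Box]:
    """Pretend detector: one box around all non-zero pixels, if any."""
    ys = [y for y, row in enumerate(frame) if any(row)]
    xs = [x for row in frame for x, v in enumerate(row) if v]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)] if ys else []


def toy_segment(frame: Frame, box: Box) -> Mask:
    """Pretend segmenter: non-zero pixels that fall inside the box."""
    x0, y0, x1, y1 = box
    return [
        [1 if (x0 <= x < x1 and y0 <= y < y1 and v) else 0
         for x, v in enumerate(row)]
        for y, row in enumerate(frame)
    ]


video = [
    [[0, 0, 0], [0, 9, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 7], [0, 0, 0]],
]
masks = generate_video_masks(video, "a cat", toy_detect, toy_segment)
```

Passing the detector and segmenter in as callables keeps the sketch independent of any particular model API. Real model wrappers could be substituted for the toy functions without changing the loop.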

Inference Details

Our fork of Open-Sora-Plan is available at DATAGRID-Research-org/Open-Sora-Plan. It adds mask handling to the inpainting dataset and pipeline, extending the original model with improved inpainting functionality through enhanced mask processing.
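
To illustrate what mask handling means for inpainting in general, here is a small sketch, assuming the usual convention that a mask value of 1 marks the region to regenerate and 0 marks pixels to keep. The function name `apply_inpaint_mask` and the `fill` parameter are hypothetical and are not taken from the fork, whose actual mask processing may differ.

```python
from typing import List, Sequence

Frame = List[List[int]]  # toy grayscale frame: rows of pixel values
Mask = List[List[int]]   # assumed convention: 1 = regenerate, 0 = keep


def apply_inpaint_mask(
    frames: Sequence[Frame],
    masks: Sequence[Mask],
    fill: int = 0,
) -> List[Frame]:
    """Blank out the masked region of every frame, leaving other pixels intact."""
    if len(frames) != len(masks):
        raise ValueError("need exactly one mask per frame")
    masked_video = []
    for frame, mask in zip(frames, masks):
        same_rows = len(frame) == len(mask)
        if not same_rows or any(len(r) != len(m) for r, m in zip(frame, mask)):
            raise ValueError("mask and frame must have the same size")
        masked_video.append(
            [[fill if m else v for v, m in zip(row, mask_row)]
             for row, mask_row in zip(frame, mask)]
        )
    return masked_video


frames = [[[5, 6], [7, 8]]]
masks = [[[0, 1], [1, 0]]]
masked = apply_inpaint_mask(frames, masks)
```

In a typical inpainting setup, the masked video and the mask itself are both given to the model as conditioning, so the model knows which pixels to preserve and which to synthesize from the prompt.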

Results

T2V

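| category | prompt | Open-Sora-Plan-v1.3.0 | DATAGRID-Open-Sora-Plan-v1.3.0-0.16M |
| --- | --- | --- | --- |
| dynamic degree | an airplane accelerating to gain speed | (not shown) | (not shown) |
| object class | a bicycle | (not shown) | (not shown) |
| human action | A person is ice skating | (not shown) | (not shown) |
| color | A pink bird | (not shown) | (not shown) |
| imaging quality | this is how I do makeup in the morning | (not shown) | (not shown) |
| spatial relationship | a kite on the top of a skateboard, front view | (not shown) | (not shown) |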

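V2V (Inpainting)

| original video | prompt (short) | mask | DATAGRID-Open-Sora-Plan-v1.3.0-0.16M (Inpaint) |
| --- | --- | --- | --- |
| (not shown) | A juicer pours orange juice into a container. ・・・Morning juice-making scene. | (not shown) | (not shown) |
| (not shown) | A reddish-brown cat resting in a grassy field, ・・・relaxed and content. | (not shown) | (not shown) |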

License

This model is released under the Apache 2.0 License.

Citation

Citation information will be provided at a later date.