license: cc-by-nc-4.0
language:
  - en
base_model:
  - stabilityai/stable-diffusion-3.5-medium

Text-Image-to-Image Trained on MANGA109 Pose HA Manga Images

This repository contains an image generation model trained on manga images processed with MANGA109 Pose tools. Create the conditional input images with the tools in that repository.

Training Parameters

| Argument | Value |
| --- | --- |
| resolution | 512 |
| train batch size | 4 |
| learning rate | 1e-05 |
| mixed precision | fp16 |
| max train steps | 200,000 |

Training Dataset

The MANGA109 Pose HA dataset was split into training, validation, and test sets in an 8:1:1 ratio.
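An 8:1:1 split can be sketched as below. This is only an illustration: the card does not say whether the split was random, or whether it was made per image or per title, so the shuffling, the seed, and the function name `split_8_1_1` are assumptions rather than the author's procedure.

```python
import random


def split_8_1_1(items, seed=0):
    """Split items into train/validation/test parts in an 8:1:1 ratio.

    Assumption: a seeded random shuffle over individual items. The actual
    split procedure used for MANGA109 Pose HA is not described in the card.
    """
    items = list(items)
    random.Random(seed).shuffle(items)

    n = len(items)
    n_train = n * 8 // 10
    n_val = n // 10

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


train, val, test = split_8_1_1(range(100))
print(len(train), len(val), len(test))
```

With 100 items this yields 80, 10, and 10 items. When the item count is not a multiple of 10, the leftover items fall into the test set.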

Author's Environment

  • GPU: H100 NVL (1 unit)
  • CUDA: 12.4
  • PyTorch: 2.6.0+cu124
  • diffusers: 0.33.0.dev0
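To compare a local setup with the versions listed above, a small check like the following may help. It is a sketch using only the Python standard library. The distribution names `torch` and `diffusers` are the usual PyPI names, and the helper names are hypothetical. Other versions may work as well; the card only reports what the author used.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported in the author's environment.
REFERENCE = {"torch": "2.6.0+cu124", "diffusers": "0.33.0.dev0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_environment(reference=REFERENCE):
    """Map each package to (expected, found, matches)."""
    report = {}
    for package, expected in reference.items():
        found = installed_version(package)
        report[package] = (expected, found, found == expected)
    return report


for package, (expected, found, matches) in compare_environment().items():
    status = "match" if matches else "differs"
    print(f"{package}: expected {expected}, found {found} ({status})")
```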

Computation Time

Training was conducted on a single H100 NVL GPU (94GB) and took 88 hours. Each training step took approximately 1.58 seconds.
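The two figures are consistent with the step count in the training parameters, as a quick calculation shows. The snippet only multiplies the numbers given in this card; the variable names are illustrative.

```python
MAX_TRAIN_STEPS = 200_000   # from the training parameters
SECONDS_PER_STEP = 1.58     # approximate time per step reported above

total_seconds = MAX_TRAIN_STEPS * SECONDS_PER_STEP
total_hours = total_seconds / 3600

print(f"{total_hours:.1f} hours")  # 87.8 hours, i.e. roughly 88 hours
```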

License

This repository is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).

Citation

If you use this repository in your research, please consider citing it using the following BibTeX entry:

@article{okada2025manga109pose,
  title={MANGA109 に姿勢情報を追加したデータセットの構築による姿勢を制御した漫画キャラクター画像生成},
  author={岡田 湧路 and 北川 峻 and 渡邉 謙吾 and 稲葉 通将 and 橋本 敦史 and 栗原 聡},
  journal={人工知能学会全国大会論文集},
  volume={JSAI2025},
  pages={2O1GS1005-2O1GS1005},
  year={2025}
}

Update History

  • 2025/04/25: [Public Release]