metadata

license: cc-by-nc-4.0
language:
  - en
base_model:
  - stabilityai/stable-diffusion-3.5-medium

Text-Image-to-Image Trained on MANGA109 Pose HA Manga Images

This repository contains an image generation model trained on manga images prepared with MANGA109 Pose tools. Please create the conditional input images using the MANGA109 Pose tools repository.

Training Parameters

Argument	Value
resolution	512
train batch size	4
learning rate	1e-05
mixed precision	fp16
max train steps	200,000
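
For readers who want to reuse these settings, the table can be mirrored in a small configuration object. This is only a minimal sketch: the `TrainConfig` class, its field names, and the `samples_seen` helper are hypothetical and are not taken from the author's training script; only the values come from the table above. The derived sample count assumes no gradient accumulation and a single GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values listed in the table above."""
    resolution: int = 512
    train_batch_size: int = 4
    learning_rate: float = 1e-5
    mixed_precision: str = "fp16"
    max_train_steps: int = 200_000

    def samples_seen(self, grad_accum_steps: int = 1, num_gpus: int = 1) -> int:
        # Assumption: one optimizer step consumes
        # train_batch_size * grad_accum_steps * num_gpus images.
        return (self.max_train_steps * self.train_batch_size
                * grad_accum_steps * num_gpus)


config = TrainConfig()
total_samples = config.samples_seen()
print(total_samples)  # 800000 under the assumptions above
```

Under these assumptions the run processes 800,000 training samples in total, counting repeated passes over the same images.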

Training Dataset

The MANGA109 Pose HA dataset was split into training, validation, and test sets in an 8:1:1 ratio.
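
An 8:1:1 split can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the card does not say whether the split was random, whether it was made per image or per manga title, or which seed was used. The function name `split_8_1_1` and the placeholder image IDs are hypothetical.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle items and split them into train/val/test at roughly 8:1:1."""
    shuffled = list(items)
    # A fixed seed keeps the split reproducible (assumed, not from the card).
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Placeholder IDs; the real dataset layout is not described in this card.
image_ids = [f"image_{i:04d}" for i in range(1000)]
train_ids, val_ids, test_ids = split_8_1_1(image_ids)
print(len(train_ids), len(val_ids), len(test_ids))  # 800 100 100
```

If pages from the same manga title should not appear in more than one subset, the same function can be applied to a list of titles instead of a list of images.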

Author's Environment

GPU: H100 NVL (1 unit)
CUDA: 12.4
PyTorch: 2.6.0+cu124
diffusers: 0.33.0.dev0

Computation Time

Training was conducted on a single H100 NVL GPU (94 GB) and took about 88 hours, at approximately 1.58 seconds per training step.
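
The reported duration is consistent with the step count and the per-step time. The short check below uses only the figures stated in this card (200,000 steps, 1.58 seconds per step); it assumes the per-step time is a steady average and ignores any overhead from validation or checkpointing, which the card does not describe.

```python
max_train_steps = 200_000   # from the training parameters
seconds_per_step = 1.58     # from the computation time section

total_seconds = max_train_steps * seconds_per_step
total_hours = total_seconds / 3600

print(f"{total_hours:.1f} hours")  # 87.8 hours, which rounds to 88
```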

License

This repository is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Citation

If you use this repository in your research, please consider citing it using the following BibTeX entry:

@article{okada2025manga109pose,
  title={MANGA109 に姿勢情報を追加したデータセットの構築による姿勢を制御した漫画キャラクター画像生成},
  author={岡田 湧路 and 北川 峻 and 渡邉 謙吾 and 稲葉 通将 and 橋本 敦史 and 栗原 聡},
  journal={人工知能学会全国大会論文集},
  volume={JSAI2025},
  pages={2O1GS1005-2O1GS1005},
  year={2025}
}

Update History

2025/04/25: [Public Release]