---
license: cc-by-nc-4.0
language:
- en
base_model:
- stabilityai/stable-diffusion-3.5-medium
---

# Text-Image-to-Image Model Trained on MANGA109 Pose HA Manga Images

This repository contains an image generation model trained on manga images from [MANGA109 Pose tools](https://github.com/kuri-lab/MANGA109-Pose-tools).

Please create the conditional input images using the repository linked above.

## Training Parameters

| Argument | Value |
| ---- | ---- |
| resolution | 512 |
| train batch size | 4 |
| learning rate | 1e-05 |
| mixed precision | fp16 |
| max train steps | 200,000 |
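
For reference, the table can be captured as a small configuration object. This is only a sketch: the class name `TrainingConfig` and its field names are hypothetical and are not taken from the author's training script, and the derived sample count assumes no gradient accumulation, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container mirroring the table above (not the author's code)."""

    resolution: int = 512
    train_batch_size: int = 4
    learning_rate: float = 1e-05
    mixed_precision: str = "fp16"
    max_train_steps: int = 200_000

    def samples_seen(self) -> int:
        # Assumes one optimizer step per batch (no gradient accumulation).
        return self.train_batch_size * self.max_train_steps


config = TrainingConfig()
print(config.samples_seen())  # 800000
```

Under that assumption, the run processes 800,000 training samples in total, counting repeats.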

## Training Dataset

The MANGA109 Pose HA dataset was split into training, validation, and test sets in an 8:1:1 ratio.
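
An 8:1:1 split can be sketched as follows. The function name, the fixed seed, the shuffling step, and the choice to assign the rounding remainder to the test set are all assumptions for illustration; the source does not say how items were ordered or whether the split was made per image, per page, or per title.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_8_1_1(items: Sequence[T], seed: int = 0) -> tuple[list[T], list[T], list[T]]:
    """Shuffle items and split them into train/validation/test at 8:1:1."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # any rounding remainder goes to the test set
    return train, val, test


train, val, test = split_8_1_1(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```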

## Author's Environment

- GPU: H100 NVL (1 unit)
- CUDA: 12.4
- PyTorch: 2.6.0+cu124
- diffusers: 0.33.0.dev0

## Computation Time

Training was conducted on a single H100 NVL GPU (94 GB) and took approximately 88 hours.

Each training step took approximately 1.58 seconds.
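
The two figures above are consistent with the 200,000 training steps listed under Training Parameters, as a quick calculation shows. The variable names are illustrative only.

```python
max_train_steps = 200_000   # from the Training Parameters table
seconds_per_step = 1.58     # approximate per-step time reported above

total_seconds = max_train_steps * seconds_per_step
total_hours = total_seconds / 3600

print(f"{total_hours:.1f} hours")  # 87.8 hours
```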

## License

This repository is licensed under [Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/deed.en).

## Citation

If you use this repository in your research, please consider citing it using the following BibTeX entry:

```
@article{okada2025manga109pose,
  title={MANGA109 に姿勢情報を追加したデータセットの構築による姿勢を制御した漫画キャラクター画像生成},
  author={岡田 湧路 and 北川 峻 and 渡邉 謙吾 and 稲葉 通将 and 橋本 敦史 and 栗原 聡},
  journal={人工知能学会全国大会論文集},
  volume={JSAI2025},
  pages={2O1GS1005-2O1GS1005},
  year={2025}
}
```

## Update History

* 2025/04/25: [Public Release]