Akirashindo39/kanji-diffusion-v1-4-kanjidic2
This is a text-to-image diffusion model that hallucinates new, Kanji-like characters from an English prompt.
Fine-tuned Model Details
- Developed by: Akira Shindo
- Model type: Diffusion-based text-to-image generation model, fine-tuned from Stable Diffusion v1.4 (CompVis/stable-diffusion-v1-4) with LoRA on the Akirashindo39/KANJIDIC2 dataset.
How to use
Run the following script in Google Colab. A GPU runtime (such as a T4) is recommended; without one, generation takes a long time. You will need a Hugging Face access token for the login step.
!pip install diffusers
!git clone https://github.com/huggingface/diffusers
!huggingface-cli login
import os
from google.colab import drive
# Mount Google Drive to access persistent storage across Colab sessions
drive.mount('/content/drive')
# Navigate to the project directory in Google Drive
os.chdir("/content/drive/MyDrive")
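from diffusers import StableDiffusionPipeline
import torch
# Release cached GPU memory left over from earlier cells
torch.cuda.empty_cache()
model_path = "Akirashindo39/kanji-diffusion-v1-4-kanjidic2"
# Load the base Stable Diffusion v1.4 pipeline in half precision on the GPU
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    use_safetensors=True
).to("cuda")
# Apply the fine-tuned LoRA attention weights to the UNet
pipe.unet.load_attn_procs(model_path)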
new_kanji_meaning = "internet" # Enter new kanji meaning here
prompt = f"a Kanji meaning {new_kanji_meaning}"
image = pipe(prompt).images[0]
image.save(f"{new_kanji_meaning}-kanji-v1-4.png")
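To try several meanings in one run, the script above can be wrapped in a small helper. The following is a minimal sketch, not part of the original model card: the function names `build_prompt`, `output_name`, and `generate_kanji`, the seed value, and the file-naming scheme are all illustrative assumptions. It reuses the same prompt template and loading calls as the script above, and passes a seeded `torch.Generator` so that each image can be reproduced.

```python
from pathlib import Path


def build_prompt(meaning: str) -> str:
    """Format a meaning with the same prompt template as the script above."""
    return f"a Kanji meaning {meaning}"


def output_name(meaning: str, seed: int) -> str:
    """Hypothetical naming scheme: spaces become hyphens and the seed is appended."""
    slug = "-".join(meaning.lower().split())
    return f"{slug}-seed{seed}-kanji-v1-4.png"


def generate_kanji(meanings, seed=42, out_dir="."):
    """Generate one image per meaning and return the saved file paths."""
    # Heavy imports stay inside the function so the helpers above work without a GPU.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        torch_dtype=torch.float16,
        use_safetensors=True,
    ).to("cuda")
    # Same loading call as the script above; newer diffusers releases
    # also offer pipe.load_lora_weights(...) for LoRA checkpoints.
    pipe.unet.load_attn_procs("Akirashindo39/kanji-diffusion-v1-4-kanjidic2")

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for meaning in meanings:
        # Re-seed for every prompt so each image is reproducible on its own.
        generator = torch.Generator(device="cuda").manual_seed(seed)
        image = pipe(build_prompt(meaning), generator=generator).images[0]
        target = out_path / output_name(meaning, seed)
        image.save(target)
        saved.append(target)
    return saved
```

In a Colab GPU session, a call such as `generate_kanji(["internet", "space station"], seed=7)` would write one PNG per meaning to the current directory.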
Training details
Hardware used: Google Colab with a T4 GPU and 8 GB of RAM
The training command below ran train_text_to_image_lora.py and completed in approximately two hours.
# Launch LoRA fine-tuning for text-to-image model with accelerate
!accelerate launch train_text_to_image_lora.py \
--pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
--dataset_name="Akirashindo39/KANJIDIC2" \
--image_column="image" \
--caption_column="text" \
--resolution=512 \
--random_flip \
--train_batch_size=1 \
--num_train_epochs=1 \
--checkpointing_steps=2000 \
--learning_rate=1e-04 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--seed=42 \
--output_dir="Akirashindo39/kanji-diffusion-v1-4-kanjidic2" \
--validation_prompt="A kanji meaning Elon Musk" \
--push_to_hub
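Outside a notebook, the same command can be assembled from a Python dictionary, which makes it easier to vary one hyperparameter at a time. This is only a sketch under stated assumptions: the `TRAIN_ARGS` dictionary and the `build_command` helper are hypothetical names introduced here, the values are copied from the command above, and the script path assumes `train_text_to_image_lora.py` is in the current directory.

```python
import shlex

# Hyperparameters copied from the training command above.
TRAIN_ARGS = {
    "pretrained_model_name_or_path": "CompVis/stable-diffusion-v1-4",
    "dataset_name": "Akirashindo39/KANJIDIC2",
    "image_column": "image",
    "caption_column": "text",
    "resolution": 512,
    "random_flip": True,          # boolean flag, emitted without a value
    "train_batch_size": 1,
    "num_train_epochs": 1,
    "checkpointing_steps": 2000,
    "learning_rate": "1e-04",     # kept as a string to preserve the formatting
    "lr_scheduler": "constant",
    "lr_warmup_steps": 0,
    "seed": 42,
    "output_dir": "Akirashindo39/kanji-diffusion-v1-4-kanjidic2",
    "validation_prompt": "A kanji meaning Elon Musk",
    "push_to_hub": True,          # boolean flag, emitted without a value
}


def build_command(args, script="train_text_to_image_lora.py"):
    """Turn a dict of options into an `accelerate launch` argument list."""
    cmd = ["accelerate", "launch", script]
    for name, value in args.items():
        if value is True:
            cmd.append(f"--{name}")          # bare flag
        elif value is False or value is None:
            continue                          # disabled flag, skip it
        else:
            cmd.append(f"--{name}={value}")  # key=value option
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection; pass the list to
    # subprocess.run(...) to actually start training.
    print(shlex.join(build_command(TRAIN_ARGS)))
```

Building the arguments as a list avoids shell-quoting problems with values that contain spaces, such as the validation prompt.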