flux-dev-tu-ya-ya

The model isn’t very good yet. I’m stuck for now, but I’ll keep training.

I’m rebuilding the dataset, but as I said, I’m stuck.

I need to get away from this for a while, both mentally and physically, probably for a few days.

If anyone is willing to donate toward this, even 100 yuan, I would be very happy, and it would speed up progress.

This is a standard PEFT LoRA derived from flux/unknown-model.

No validation prompt was used during training.

Validation settings

CFG: 3.0
CFG Rescale: 0.0
Steps: 20
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolutions: 1024x1024,1280x768
Skip-layer guidance:

Note: The validation settings are not necessarily the same as the training settings.
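
The validation settings listed above could be turned into per-resolution call arguments roughly as follows. This is a minimal sketch: `parse_resolutions` and `validation_kwargs` are hypothetical helper names, the keyword names assume the usual diffusers pipeline call arguments, and the width-first reading of `1280x768` is an assumption.

```python
# Sketch only: the helper names below are hypothetical, not part of any library.
VALIDATION = {
    "guidance_scale": 3.0,       # CFG
    "num_inference_steps": 20,   # Steps
    "seed": 42,                  # Seed (applied separately via a torch.Generator)
    "resolutions": "1024x1024,1280x768",
}

def parse_resolutions(spec):
    """Split 'WxH,WxH' into (width, height) tuples; assumes width comes first."""
    pairs = []
    for item in spec.split(","):
        width, height = item.strip().lower().split("x")
        pairs.append((int(width), int(height)))
    return pairs

def validation_kwargs(settings):
    """Build one dict of pipeline call arguments per resolution."""
    return [
        {
            "num_inference_steps": settings["num_inference_steps"],
            "guidance_scale": settings["guidance_scale"],
            "width": width,
            "height": height,
        }
        for width, height in parse_resolutions(settings["resolutions"])
    ]

for kwargs in validation_kwargs(VALIDATION):
    print(kwargs)
```

Each resulting dict could then be passed to the pipeline as `pipeline(prompt=..., generator=..., **kwargs)`.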

You can find some example images in the following gallery:

Prompt
unconditional (blank prompt)

Negative Prompt
blurry, cropped, ugly

Prompt
a futuristic anime-style portrait of a young girl analyzing holographic data on a spaceship bridge, surrounded by glowing interfaces and cosmic vistas through viewports

Negative Prompt
blurry, cropped, ugly

Prompt
a warm intimate portrait of a girl immersed in reading within a cozy library nook, golden hour light filtering through shelves of ancient tomes

Negative Prompt
blurry, cropped, ugly

Prompt
a detailed cinematic scene featuring a doctor in lab coat and curious child examining advanced medical equipment in a hybrid clinic-library space

Negative Prompt
blurry, cropped, ugly

Prompt
a stylized isometric render of a contemporary bedroom-study with modular furniture, floating bookshelves and creative tech integrations

Negative Prompt
blurry, cropped, ugly

Prompt
a vaporwave-inspired scene blending 1980s aesthetics with holographic interfaces, featuring a girl in retro clothing interacting with CRT-style displays

Negative Prompt
blurry, cropped, ugly

Prompt
a tender moment capturing a father-figure teaching a child robotics repair in a workshop filled with half-assembled gadgets and engineering blueprints

Negative Prompt
blurry, cropped, ugly

Prompt
a dynamic composition showing a content creator using tablet with AR interface in a maker-space studio, surrounded by 3D printers and prototype gadgets

Negative Prompt
blurry, cropped, ugly

Prompt
a high-contrast cinematic still featuring temporal displacement effects around characters, with glowing chrono-interface displaying 2058 date codes

Negative Prompt
blurry, cropped, ugly

Prompt
a warm documentary-style photo of mentor and student collaborating on science project using mixed reality tablets in home laboratory

Negative Prompt
blurry, cropped, ugly

Prompt
a concept art scene blending domestic comfort with advanced biotech - living furniture, nanobot clouds maintaining books, adaptive architecture

Negative Prompt
blurry, cropped, ugly

Prompt
a Bilibili-branded content creator setup showing seamless integration of streaming tech into cozy living space with perfect lighting balance

Negative Prompt
blurry, cropped, ugly

Prompt
a family scene in autonomous electric vehicle with full AR windshield displays, child reaching toward holographic navigation interface

Negative Prompt
blurry, cropped, ugly

Prompt
a striking portrait of scientist-parent figure in convertible lab coat/domestic attire, holding both medical tablet and child's toy robot

Negative Prompt
blurry, cropped, ugly

Prompt
a diptych composition contrasting retro 1990s computer equipment with futuristic 2058 hologram tech through family timeline perspective

Negative Prompt
blurry, cropped, ugly

Prompt
a transformable room concept shifting between medical lab, maker space and cozy bedroom through modular walls and augmented reality overlays

Negative Prompt
blurry, cropped, ugly

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Training epochs: 48
Training steps: 10000
Learning rate: 0.0001
- Learning rate schedule: polynomial
- Warmup steps: 100
Max grad norm: 2.0
Effective batch size: 1
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 1
Gradient checkpointing: True
Prediction type: flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])
Optimizer: adamw_bf16
Trainable parameter precision: Pure BF16
Caption dropout probability: 10.0%
LoRA Rank: 16
LoRA Alpha: None
LoRA Dropout: 0.1
LoRA initialisation style: default
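
The `shift=3` entry under the prediction type most likely refers to the timestep shift commonly used with flow-matching models, which remaps a noise level sigma in [0, 1] toward the noisier end. Below is a minimal sketch of that remapping, assuming the trainer uses the same formula as the static shift in diffusers' `FlowMatchEulerDiscreteScheduler`. The function name `shift_sigma` is hypothetical.

```python
def shift_sigma(sigma, shift=3.0):
    """Remap sigma in [0, 1]; a shift above 1 pushes values toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

# Print how a few evenly spaced noise levels move under shift=3.
for sigma in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{sigma:.2f} -> {shift_sigma(sigma):.4f}")
```

Under this formula the endpoints 0 and 1 stay fixed, while the midpoint 0.5 maps to 0.75, so more of the training signal is spent on high-noise timesteps.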

Datasets

dreambooth-512

Repeats: 1
Total number of images: 52
Total number of aspect buckets: 1
Resolution: 0.262144 megapixels
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No

dreambooth-1024

Repeats: 1
Total number of images: 52
Total number of aspect buckets: 1
Resolution: 1.048576 megapixels
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No
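
The dataset figures can be cross-checked against the training settings with a little arithmetic. The sketch below assumes the images are square (which gives the side lengths the dataset names suggest) and that `Repeats: 1` means each image is seen one extra time per epoch, i.e. twice. Both are assumptions, not something this card states.

```python
import math

datasets = {
    "dreambooth-512": {"images": 52, "repeats": 1, "megapixels": 0.262144},
    "dreambooth-1024": {"images": 52, "repeats": 1, "megapixels": 1.048576},
}
effective_batch_size = 1 * 1 * 1  # micro-batch x gradient accumulation x GPUs
training_steps = 10000

def square_side(megapixels):
    """Side length in pixels of a square image with the given area."""
    return math.isqrt(round(megapixels * 1_000_000))

def samples_per_epoch(sets):
    """Assumption: repeats=N means N extra passes, so N + 1 views per image."""
    return sum(d["images"] * (d["repeats"] + 1) for d in sets.values())

steps_per_epoch = samples_per_epoch(datasets) // effective_batch_size
for name, d in datasets.items():
    print(name, square_side(d["megapixels"]))
print(steps_per_epoch, training_steps / steps_per_epoch)
```

Under these assumptions there are 208 samples per epoch, so 10000 steps come to roughly 48 epochs, which is consistent with the reported value.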

Inference

import torch
from diffusers import DiffusionPipeline

# Local path from the training machine; point this at your own copy of the base model.
model_id = '/root/autodl-tmp/checkout-redbook/flux-dev-china-girl-lora-merge'
adapter_id = 'likewendy/flux-dev-tu-ya-ya'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."


## Optional: quantise the model to save on VRAM.
## Note: The model was not quantised during training, so quantising it at inference time is not necessary.
#from optimum.quanto import quantize, freeze, qint8
#quantize(pipeline.transformer, weights=qint8)
#freeze(pipeline.transformer)
    
# Pick the best available device once; the pipeline is already in its target precision level.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)
image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")
