ProteusV0.5
ProteusV0.5 is the latest full release of my AI image generation model, built as a sophisticated enhancement over OpenDalleV1.1. This version brings significant improvements in photorealism, prompt comprehension, and stylistic capabilities across various domains.
About Proteus
Proteus leverages and enhances the core functionalities of OpenDalleV1.1 to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. The model has been fine-tuned on a carefully curated dataset of copyright-free stock images and high-quality AI-generated image pairs.
Key Improvements in V0.5:
Advanced Custom CLIP Integration:
Incorporates a meticulously trained custom CLIP model
Steadily developed over an extended period
Further fine-tuned for specific qualities in Proteus and Prometheus
Estimated to contribute 90% of the model's performance improvements
Requires a clip skip setting of 2 for optimal performance (see the sketch after this list)
Further Refinement of Stylistic Capabilities:
Enhanced ability to generate diverse artistic styles
Improved coherence in complex scenes and compositions
Expanded Training Dataset:
Now totaling over 400,000 images
Significantly broadened knowledge base and generation capabilities
Balanced Creativity and Accuracy:
Addressed previous issues of being "too stylistic/creative"
Improved alignment between user prompts and generated outputs
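"Clip skip" refers to conditioning generation on an earlier hidden layer of the CLIP text encoder rather than its final one; in A1111-style UIs, a clip skip of 2 is commonly read as using the penultimate layer, though exact indexing conventions differ between tools. Below is a minimal standalone sketch of that idea, using the generic openai/clip-vit-large-patch14 checkpoint purely for illustration (it is not Proteus's own text encoder, which ships inside the SDXL pipeline); in diffusers this behaviour is exposed through the clip_skip argument shown in the usage example further down.

import torch
from transformers import CLIPTokenizer, CLIPTextModel

# Illustration only: a generic CLIP text encoder, not Proteus's bundled encoders
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

inputs = tokenizer(
    "a cat wearing sunglasses on the beach",
    padding="max_length",
    max_length=tokenizer.model_max_length,
    truncation=True,
    return_tensors="pt",
)
with torch.no_grad():
    outputs = text_encoder(**inputs, output_hidden_states=True)

# Default behaviour conditions on the final hidden layer; "clip skip 2" is
# commonly understood as conditioning on the penultimate layer instead
# (indexing conventions vary between UIs and libraries)
final_layer = outputs.hidden_states[-1]
penultimate_layer = outputs.hidden_states[-2]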
Proteus's Background
Proteus serves as a sophisticated enhancement over OpenDalleV1.1, leveraging its core functionalities to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. To achieve this, it was fine-tuned on approximately 220,000 GPTV-captioned images from copyright-free stock photography (with some anime included), which were then normalized. Additionally, DPO (Direct Preference Optimization) was employed using a collection of 10,000 carefully selected high-quality, AI-generated image pairs. In pursuit of optimal performance, numerous LoRA (Low-Rank Adaptation) models were trained independently and then selectively incorporated into the principal model via dynamic application methods that target particular segments of the model while avoiding interference with other areas during the learning phase. Consequently, Proteus exhibits marked improvements in portraying intricate facial characteristics and lifelike skin textures, while sustaining commendable proficiency across various aesthetic domains, notably surrealism, anime, and cartoon-style visualizations.
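As an illustration of that merge-then-apply workflow, here is a minimal sketch using diffusers' LoRA utilities. The base checkpoint, adapter names, file paths, and weights are placeholders for illustration only, not the actual LoRAs used to build Proteus.

import torch
from diffusers import StableDiffusionXLPipeline

# Placeholder base model; requires the peft library for multi-adapter handling
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
)

# Load independently trained LoRAs as named adapters (placeholder paths)
pipe.load_lora_weights("path/to/faces_lora", adapter_name="faces")
pipe.load_lora_weights("path/to/skin_texture_lora", adapter_name="skin")

# Weight each adapter, then fold the weighted deltas into the base weights
pipe.set_adapters(["faces", "skin"], adapter_weights=[0.8, 0.5])
pipe.fuse_lora()

# Save the merged result as a single checkpoint
pipe.save_pretrained("merged-model")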
Training Details
Total training dataset: over 400,000 images
Initial training: ~220,000 GPTV-captioned images from copyright-free stock photos (including some anime)
Additional training: hand-picked photorealistic images
Fine-tuning: Direct Preference Optimization (DPO) with 10,000 carefully selected high-quality, AI-generated image pairs
LoRA (Low-Rank Adaptation) models trained independently and selectively incorporated
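For readers unfamiliar with DPO in the diffusion setting, the sketch below shows the general shape of the preference objective as I understand it: the model is rewarded for denoising the preferred image of each pair better than a frozen reference model does, relative to the rejected image. Variable names and the beta value are illustrative, not the actual training configuration used here.

import torch
import torch.nn.functional as F

def diffusion_dpo_loss(err_win, err_lose, ref_err_win, ref_err_lose, beta=5000.0):
    # err_* are per-sample denoising MSEs for the preferred (win) and rejected (lose)
    # images under the model being trained; ref_err_* are the same MSEs under a
    # frozen reference model (e.g. the checkpoint before DPO)
    model_margin = err_win - err_lose
    ref_margin = ref_err_win - ref_err_lose
    # Push the model to improve on the preferred image more than on the rejected one
    return -F.logsigmoid(-beta * (model_margin - ref_margin)).mean()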
Improvements
Enhanced portrayal of intricate facial characteristics and lifelike skin textures
Improved proficiency in surrealism, anime, and cartoon-style visualizations
Superior prompt comprehension thanks to the custom-trained CLIP
Expanded dataset leading to more diverse and accurate outputs
Refined balance between creativity and accuracy
Recommended Settings
Clip Skip: 2
CFG Scale: 7
Steps: 25-50
Sampler: DPM++ 2M SDE
Scheduler: Karras
Resolution: 1024x1024
The custom-trained CLIP is a significant point of differentiation, as very few models incorporate this feature. Enjoy creating with the fully released ProteusV0.5!
Use it with 🧨 diffusers
import torch
from diffusers import (
StableDiffusionXLPipeline,
KDPM2AncestralDiscreteScheduler,
AutoencoderKL
)
# Load VAE component
vae = AutoencoderKL.from_pretrained(
"madebyollin/sdxl-vae-fp16-fix",
torch_dtype=torch.float16
)
# Configure the pipeline
pipe = StableDiffusionXLPipeline.from_pretrained(
"dataautogpt3/ProteusV0.5",
vae=vae,
torch_dtype=torch.float16
)
pipe.scheduler = KDPM2AncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')
# Define prompts and generate image
prompt = "a cat wearing sunglasses on the beach"
negative_prompt = ""
image = pipe(
prompt,
negative_prompt=negative_prompt,
width=1024,
height=1024,
guidance_scale=7,
num_inference_steps=50,
clip_skip=2
).images[0]
image.save("generated_image.png")
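The example above uses KDPM2AncestralDiscreteScheduler. If you want to match the DPM++ 2M SDE sampler with the Karras schedule from the recommended settings instead, diffusers exposes that combination through DPMSolverMultistepScheduler; the following configuration is my suggestion rather than part of the original example.

from diffusers import DPMSolverMultistepScheduler

# DPM++ 2M SDE with Karras sigmas, mirroring the recommended sampler settings
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config,
    algorithm_type="sde-dpmsolver++",
    use_karras_sigmas=True
)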