Create README.md
README.md
ADDED
@@ -0,0 +1,235 @@
---
license: openrail++
language:
- en
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
inference: true
widget:
- text: >-
    masterpiece, best quality, 1girl, brown hair, green eyes, colorful, autumn,
    cumulonimbus clouds, lighting, blue sky, falling leaves, garden
  example_title: example 1girl
- text: >-
    masterpiece, best quality, 1boy, medium hair, blonde hair, blue eyes,
    bishounen, colorful, autumn, cumulonimbus clouds, lighting, blue sky,
    falling leaves, garden
  example_title: example 1boy
library_name: diffusers
---

<style>
  .title {
    font-size: 2.5em;
    text-align: center;
    color: #333;
    font-family: Arial, sans-serif;
    text-transform: uppercase;
    letter-spacing: 0.05em;
    padding: 0.5em 0;
    background: transparent;
    box-shadow: 0px 0px 20px 0px rgba(0,0,0,0.15);
    margin-bottom: 2em;
    display: inline-block;
    width: auto;
  }
  .title span {
    background: -webkit-linear-gradient(45deg, #fe6b8b 30%, #ff8e53 90%);
    -webkit-background-clip: text;
    -webkit-text-fill-color: transparent;
  }
  .image-grid {
    display: grid;
    grid-template-columns: repeat(3, 1fr);
    gap: 0.5em;
  }
  .image-item {
    box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15);
    padding: 10px;
  }
  .image-item img {
    width: 100%;
    height: 100%;
    object-fit: cover;
    border-radius: 10px;
    transition: transform .2s;
  }
  .image-item img:hover {
    transform: scale(1.1);
  }
  .custom-table {
    table-layout: fixed;
    width: 100%;
    border-collapse: collapse;
  }
  .custom-table td {
    width: 50%;
    vertical-align: top;
    padding: 10px;
    box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15);
  }
  .custom-image {
    width: 100%;
    height: 100%;
    object-fit: cover;
    border-radius: 10px;
    transition: transform .2s;
  }
  .custom-image:hover {
    transform: scale(1.1);
  }
</style>

<h1 class="title"><span>Hermitage XL</span></h1>

<div class="image-grid">
  <div class="image-item">
    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample1.png">
      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample1.png">
    </a>
  </div>
  <div class="image-item">
    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample2.png">
      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample2.png">
    </a>
  </div>
  <div class="image-item">
    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample3.png">
      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample3.png">
    </a>
  </div>
  <div class="image-item">
    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample4.png">
      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample4.png">
    </a>
  </div>
  <div class="image-item">
    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample5.png">
      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample5.png">
    </a>
  </div>
  <div class="image-item">
    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample6.png">
      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample6.png">
    </a>
  </div>
</div>


## Overview

Hermitage XL is a high-resolution latent text-to-image diffusion model derived from Stable Diffusion XL 1.0. It was fine-tuned with a learning rate of 4e-7 for 5000 steps at a batch size of 16 on a curated dataset of high-quality anime-style images.

Example prompt: **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**

- Use it with the [`Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
- Use it with 🧨 [`diffusers`](https://huggingface.co/docs/diffusers/index)
- Use it with [`ComfyUI`](https://github.com/comfyanonymous/ComfyUI)

## Features

1. High-Resolution Images: The model was trained at 1024x1024 resolution. Training used the [NovelAI Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing), which also allows training at non-square resolutions.
2. Anime-styled Generation: Given a text prompt, the model can create high-quality anime-style images.
3. Fine-Tuned Diffusion Process: The model uses a fine-tuned diffusion process to produce high-quality, distinctive images.
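
To illustrate the idea behind the aspect ratio bucketing mentioned in the first feature, here is a minimal sketch. It is not the NovelAI tool's implementation: the function names `make_buckets` and `nearest_bucket`, the 64-pixel step, and the 512 to 2048 side limits are all assumptions chosen for illustration. The sketch enumerates resolutions whose pixel count stays within a 1024x1024 budget and assigns an image to the bucket with the closest aspect ratio.

```python
def make_buckets(target_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose area stays within target_area.

    All defaults are illustrative assumptions, not values from the model card.
    """
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height (a multiple of `step`) that keeps the area in budget.
        height = min(max_side, (target_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # also keep the transposed bucket
        width += step
    return sorted(buckets)


def nearest_bucket(image_width, image_height, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    aspect = image_width / image_height
    return min(buckets, key=lambda wh: abs(wh[0] / wh[1] - aspect))


if __name__ == "__main__":
    buckets = make_buckets()
    # A 16:9 source image is assigned to a wide, non-square bucket.
    print(nearest_bucket(1920, 1080, buckets))
```

In a real training setup, images in the same bucket would then be resized or cropped to that bucket's resolution and batched together.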

## Model Details

- **Developed by:** [Linaqruf](https://github.com/Linaqruf)
- **Model type:** Diffusion-based text-to-image generative model
- **Model Description:** This is a model that can be used to generate and modify anime-themed images based on text prompts.
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Finetuned from model:** [Stable Diffusion XL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)

## How to Use:
- Download `Hermitage XL` [here](https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/hermitage-xl.safetensors). The model is in `.safetensors` format.
- Use Danbooru-style tags as the prompt instead of natural language; otherwise you will get realistic results instead of anime.
- You can use any generic negative prompt, or the following suggested negative prompt, to guide the model toward highly aesthetic generations:
```
lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
```
- Also prepend the following to your prompts to get highly aesthetic results:
```
masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
```
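
The prompt conventions above can be sketched as a small helper. This is only an illustration: `build_prompt`, `QUALITY_TAGS`, and `NEGATIVE_PROMPT` are hypothetical names, and the tag strings are copied from the two blocks above.

```python
# Quality tags suggested above; they are prepended to every prompt.
QUALITY_TAGS = (
    "masterpiece, best quality, illustration, beautiful detailed, "
    "finely detailed, dramatic light, intricate details"
)

# Suggested negative prompt from above.
NEGATIVE_PROMPT = (
    "lowres, bad anatomy, bad hands, text, error, missing fingers, "
    "extra digit, fewer digits, cropped, worst quality, low quality, "
    "normal quality, jpeg artifacts, signature, watermark, username, blurry"
)


def build_prompt(subject_tags):
    """Join Danbooru-style tags into one prompt, quality tags first."""
    # Drop empty entries and stray whitespace around each tag.
    tags = [tag.strip() for tag in subject_tags if tag.strip()]
    return ", ".join([QUALITY_TAGS, *tags])


if __name__ == "__main__":
    print(build_prompt(["1girl", "white hair", "golden eyes", "garden"]))
    print(NEGATIVE_PROMPT)
```

The resulting strings can be passed as the prompt and negative prompt in any of the front ends listed in the overview.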

## 🧨 Diffusers

Make sure to upgrade diffusers to >= 0.18.2:
```
pip install diffusers --upgrade
```
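
If you want to confirm the installed version from Python before running the pipeline, a minimal sketch follows. The helpers `parse_release` and `meets_minimum` are hypothetical names written for this example; only `importlib.metadata` from the standard library is used, and pre-release suffixes such as `.dev0` are simply ignored.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn a version such as '0.18.2' or '0.21.0.dev0' into (major, minor, patch)."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break  # stop at suffixes like 'rc1'
            digits += char
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '0.19'
    return tuple(parts)


def meets_minimum(installed, minimum="0.18.2"):
    """Return True if the installed release is at least the minimum release."""
    return parse_release(installed) >= parse_release(minimum)


if __name__ == "__main__":
    try:
        installed = version("diffusers")
        print(installed, "OK" if meets_minimum(installed) else "too old, please upgrade")
    except PackageNotFoundError:
        print("diffusers is not installed")
```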

In addition, make sure to install `transformers`, `safetensors`, and `accelerate`, as well as the invisible watermark:
```
pip install invisible_watermark transformers accelerate safetensors
```

Running the pipeline: if you don't swap the scheduler, it will run with the default **EulerDiscreteScheduler**. In this example we swap it to **EulerAncestralDiscreteScheduler**:
```py
import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
from diffusers.models import AutoencoderKL

model = "Linaqruf/hermitage-xl"

# Load the SDXL VAE separately. It is loaded in float16 here so that its dtype
# matches the pipeline and no dtype mismatch occurs at decode time.
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae", torch_dtype=torch.float16)

pipe = StableDiffusionXLPipeline.from_pretrained(
    model,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    vae=vae,
)

# Swap the default EulerDiscreteScheduler for the ancestral variant.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

prompt = "masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, watercolor, night, turtleneck"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=12,
    target_size=(1024, 1024),
    original_size=(4096, 4096),
    num_inference_steps=50,
).images[0]

image.save("anime_girl.png")
```

## Limitations
1. This model inherits the [limitations](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0#limitations) of Stable Diffusion XL 1.0.
2. This model is overfitted and does not follow prompts well, because it was fine-tuned for 5000 steps on a small-scale dataset.
3. It is only a preview model, intended to find good hyperparameters and a training config for Stable Diffusion XL 1.0.

## Example

Here are some cherry-picked samples and a comparison between available models:

<table class="custom-table">
  <tr>
    <td>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image1.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image1.png" alt="sample1">
      </a>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image3.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image3.png" alt="sample3">
      </a>
    </td>
    <td>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image2.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image2.png" alt="sample2">
      </a>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image4.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image4.png" alt="sample4">
      </a>
    </td>
  </tr>
</table>
<hr>