feature/diffusers-model
#3
by ayan4m1 · opened
- README.md +5 -69
- feature_extractor/preprocessor_config.json +0 -20
- model_index.json +0 -32
- safety_checker/config.json +0 -179
- safety_checker/pytorch_model.bin +0 -3
- scheduler/scheduler_config.json +0 -13
- text_encoder/config.json +0 -25
- text_encoder/pytorch_model.bin +0 -3
- tokenizer/merges.txt +0 -0
- tokenizer/special_tokens_map.json +0 -24
- tokenizer/tokenizer_config.json +0 -34
- tokenizer/vocab.json +0 -0
- unet/config.json +0 -36
- unet/diffusion_pytorch_model.bin +0 -3
- vae/config.json +0 -30
- vae/diffusion_pytorch_model.bin +0 -3
README.md
CHANGED

````diff
@@ -1,81 +1,17 @@
 ---
-inference:
+inference: false
 tags:
 - stable-diffusion
 - stable-diffusion-diffusers
 - text-to-image
-license:
+license: apache-2.0
 ---
 
 ## Please Note!
 
-This model is NOT the 19.2M images Characters Model on TrinArt, but an improved version of the original
-
-Other TrinArt models can be found at:
-
-https://huggingface.co/naclbit/trinart_derrida_characters_v2_stable_diffusion
-
-https://huggingface.co/naclbit/trinart_characters_19.2m_stable_diffusion_v1
-
-
-## Diffusers
-
-The model has been ported to `diffusers` by [ayan4m1](https://huggingface.co/ayan4m1)
-and can easily be run from one of the branches:
-- `revision="diffusers-60k"` for the checkpoint trained on 60,000 steps,
-- `revision="diffusers-95k"` for the checkpoint trained on 95,000 steps,
-- `revision="diffusers-115k"` for the checkpoint trained on 115,000 steps.
-
-For more information, please have a look at [the "Three flavors" section](#three-flavors).
-
-## Gradio
-
-We also support a [Gradio](https://github.com/gradio-app/gradio) web ui with diffusers to run inside a colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1RWvik_C7nViiR9bNsu3fvMR3STx6RvDx?usp=sharing)
-
-
-### Example Text2Image
-
-```python
-# !pip install diffusers==0.3.0
-from diffusers import StableDiffusionPipeline
-
-# using the 60,000 steps checkpoint
-pipe = StableDiffusionPipeline.from_pretrained("naclbit/trinart_stable_diffusion_v2", revision="diffusers-60k")
-pipe.to("cuda")
-
-image = pipe("A magical dragon flying in front of the Himalaya in manga style").images[0]
-image
-```
-
-![dragon](https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/a_magical_dragon_himalaya.png)
-
-If you want to run the pipeline faster or on a different hardware, please have a look at the [optimization docs](https://huggingface.co/docs/diffusers/optimization/fp16).
-
-### Example Image2Image
-
-```python
-# !pip install diffusers==0.3.0
-from diffusers import StableDiffusionImg2ImgPipeline
-import requests
-from PIL import Image
-from io import BytesIO
-
-url = "https://scitechdaily.com/images/Dog-Park.jpg"
-
-response = requests.get(url)
-init_image = Image.open(BytesIO(response.content)).convert("RGB")
-init_image = init_image.resize((768, 512))
-
-# using the 115,000 steps checkpoint
-pipe = StableDiffusionImg2ImgPipeline.from_pretrained("naclbit/trinart_stable_diffusion_v2", revision="diffusers-115k")
-pipe.to("cuda")
-
-images = pipe(prompt="Manga drawing of Brad Pitt", init_image=init_image, strength=0.75, guidance_scale=7.5).images
-image
-```
-
-If you want to run the pipeline faster or on a different hardware, please have a look at the [optimization docs](https://huggingface.co/docs/diffusers/optimization/fp16).
+This model is NOT the 19.2M images Characters Model on TrinArt, but an improved version of the original trinsama Twitter bot model. This model is intended to retain the original SD's aesthetics as much as possible while nudging the model to anime/manga style.
 
+このモデルはTrinArtのキャラクターズモデル(1920万枚再学習モデル)ではありません! とりんさまAIボットのモデルの改良版です。このモデルはオリジナルのSD v1.4モデルのアートスタイルをできる限り残したまま、アニメ・マンガ方向に調整することを意図しています。
 
 ## Stable Diffusion TrinArt/Trin-sama AI finetune v2
 
@@ -126,4 +62,4 @@ Each images were diffused using K. Crowson's k-lms (from k-diffusion repo) metho
 
 #### License
 
-
+Apache License 2.0
````
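For reference, the usage documented in the removed README section reduces to picking one of the `diffusers-60k`, `diffusers-95k`, or `diffusers-115k` branches via `revision`. A minimal sketch of that pattern, assuming `diffusers` is installed and a CUDA device is available:

```python
# Sketch of the usage removed above: load a specific checkpoint branch
# by passing `revision` to from_pretrained (per the deleted README text).
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "naclbit/trinart_stable_diffusion_v2",
    revision="diffusers-60k",  # or "diffusers-95k" / "diffusers-115k"
)
pipe = pipe.to("cuda")

# .images is a list; index it (the removed img2img example omitted this step).
image = pipe("A magical dragon flying in front of the Himalaya in manga style").images[0]
image.save("dragon.png")
```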
feature_extractor/preprocessor_config.json
DELETED

```diff
@@ -1,20 +0,0 @@
-{
-  "crop_size": 224,
-  "do_center_crop": true,
-  "do_convert_rgb": true,
-  "do_normalize": true,
-  "do_resize": true,
-  "feature_extractor_type": "CLIPFeatureExtractor",
-  "image_mean": [
-    0.48145466,
-    0.4578275,
-    0.40821073
-  ],
-  "image_std": [
-    0.26862954,
-    0.26130258,
-    0.27577711
-  ],
-  "resample": 3,
-  "size": 224
-}
```
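The deleted preprocessor config is the stock CLIP image preprocessing used to feed the safety checker. A hedged sketch of rebuilding it from the values above, assuming a recent `transformers` where `CLIPImageProcessor` supersedes `CLIPFeatureExtractor`:

```python
# Sketch: recreating the deleted feature extractor from the config shown above.
from transformers import CLIPImageProcessor

feature_extractor = CLIPImageProcessor(
    crop_size=224,
    do_center_crop=True,
    do_convert_rgb=True,
    do_normalize=True,
    do_resize=True,
    image_mean=[0.48145466, 0.4578275, 0.40821073],
    image_std=[0.26862954, 0.26130258, 0.27577711],
    resample=3,  # PIL bicubic
    size=224,
)
```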
model_index.json
DELETED

```diff
@@ -1,32 +0,0 @@
-{
-  "_class_name": "StableDiffusionPipeline",
-  "_diffusers_version": "0.6.0",
-  "feature_extractor": [
-    "transformers",
-    "CLIPImageProcessor"
-  ],
-  "safety_checker": [
-    "stable_diffusion",
-    "StableDiffusionSafetyChecker"
-  ],
-  "scheduler": [
-    "diffusers",
-    "PNDMScheduler"
-  ],
-  "text_encoder": [
-    "transformers",
-    "CLIPTextModel"
-  ],
-  "tokenizer": [
-    "transformers",
-    "CLIPTokenizer"
-  ],
-  "unet": [
-    "diffusers",
-    "UNet2DConditionModel"
-  ],
-  "vae": [
-    "diffusers",
-    "AutoencoderKL"
-  ]
-}
```
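`model_index.json` is what lets `DiffusionPipeline.from_pretrained` assemble the pipeline: each entry maps a component folder to the (library, class) pair used to load it. With the file deleted, a generic load like the sketch below would no longer work on the affected revision:

```python
# Sketch: how diffusers uses model_index.json to assemble a pipeline.
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "naclbit/trinart_stable_diffusion_v2",
    revision="diffusers-115k",  # assumption: a revision that still has the file
)
print(type(pipe).__name__)      # StableDiffusionPipeline, per "_class_name"
print(sorted(pipe.components))  # feature_extractor, safety_checker, scheduler, ...
```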
safety_checker/config.json
DELETED

```diff
@@ -1,179 +0,0 @@
-{
-  "_commit_hash": "4bb648a606ef040e7685bde262611766a5fdd67b",
-  "_name_or_path": "CompVis/stable-diffusion-safety-checker",
-  "architectures": [
-    "StableDiffusionSafetyChecker"
-  ],
-  "initializer_factor": 1.0,
-  "logit_scale_init_value": 2.6592,
-  "model_type": "clip",
-  "projection_dim": 768,
-  "text_config": {
-    "_name_or_path": "",
-    "add_cross_attention": false,
-    "architectures": null,
-    "attention_dropout": 0.0,
-    "bad_words_ids": null,
-    "begin_suppress_tokens": null,
-    "bos_token_id": 0,
-    "chunk_size_feed_forward": 0,
-    "cross_attention_hidden_size": null,
-    "decoder_start_token_id": null,
-    "diversity_penalty": 0.0,
-    "do_sample": false,
-    "dropout": 0.0,
-    "early_stopping": false,
-    "encoder_no_repeat_ngram_size": 0,
-    "eos_token_id": 2,
-    "exponential_decay_length_penalty": null,
-    "finetuning_task": null,
-    "forced_bos_token_id": null,
-    "forced_eos_token_id": null,
-    "hidden_act": "quick_gelu",
-    "hidden_size": 768,
-    "id2label": {
-      "0": "LABEL_0",
-      "1": "LABEL_1"
-    },
-    "initializer_factor": 1.0,
-    "initializer_range": 0.02,
-    "intermediate_size": 3072,
-    "is_decoder": false,
-    "is_encoder_decoder": false,
-    "label2id": {
-      "LABEL_0": 0,
-      "LABEL_1": 1
-    },
-    "layer_norm_eps": 1e-05,
-    "length_penalty": 1.0,
-    "max_length": 20,
-    "max_position_embeddings": 77,
-    "min_length": 0,
-    "model_type": "clip_text_model",
-    "no_repeat_ngram_size": 0,
-    "num_attention_heads": 12,
-    "num_beam_groups": 1,
-    "num_beams": 1,
-    "num_hidden_layers": 12,
-    "num_return_sequences": 1,
-    "output_attentions": false,
-    "output_hidden_states": false,
-    "output_scores": false,
-    "pad_token_id": 1,
-    "prefix": null,
-    "problem_type": null,
-    "pruned_heads": {},
-    "remove_invalid_values": false,
-    "repetition_penalty": 1.0,
-    "return_dict": true,
-    "return_dict_in_generate": false,
-    "sep_token_id": null,
-    "suppress_tokens": null,
-    "task_specific_params": null,
-    "temperature": 1.0,
-    "tf_legacy_loss": false,
-    "tie_encoder_decoder": false,
-    "tie_word_embeddings": true,
-    "tokenizer_class": null,
-    "top_k": 50,
-    "top_p": 1.0,
-    "torch_dtype": null,
-    "torchscript": false,
-    "transformers_version": "4.23.1",
-    "typical_p": 1.0,
-    "use_bfloat16": false,
-    "vocab_size": 49408
-  },
-  "text_config_dict": {
-    "hidden_size": 768,
-    "intermediate_size": 3072,
-    "num_attention_heads": 12,
-    "num_hidden_layers": 12
-  },
-  "torch_dtype": "float32",
-  "transformers_version": null,
-  "vision_config": {
-    "_name_or_path": "",
-    "add_cross_attention": false,
-    "architectures": null,
-    "attention_dropout": 0.0,
-    "bad_words_ids": null,
-    "begin_suppress_tokens": null,
-    "bos_token_id": null,
-    "chunk_size_feed_forward": 0,
-    "cross_attention_hidden_size": null,
-    "decoder_start_token_id": null,
-    "diversity_penalty": 0.0,
-    "do_sample": false,
-    "dropout": 0.0,
-    "early_stopping": false,
-    "encoder_no_repeat_ngram_size": 0,
-    "eos_token_id": null,
-    "exponential_decay_length_penalty": null,
-    "finetuning_task": null,
-    "forced_bos_token_id": null,
-    "forced_eos_token_id": null,
-    "hidden_act": "quick_gelu",
-    "hidden_size": 1024,
-    "id2label": {
-      "0": "LABEL_0",
-      "1": "LABEL_1"
-    },
-    "image_size": 224,
-    "initializer_factor": 1.0,
-    "initializer_range": 0.02,
-    "intermediate_size": 4096,
-    "is_decoder": false,
-    "is_encoder_decoder": false,
-    "label2id": {
-      "LABEL_0": 0,
-      "LABEL_1": 1
-    },
-    "layer_norm_eps": 1e-05,
-    "length_penalty": 1.0,
-    "max_length": 20,
-    "min_length": 0,
-    "model_type": "clip_vision_model",
-    "no_repeat_ngram_size": 0,
-    "num_attention_heads": 16,
-    "num_beam_groups": 1,
-    "num_beams": 1,
-    "num_channels": 3,
-    "num_hidden_layers": 24,
-    "num_return_sequences": 1,
-    "output_attentions": false,
-    "output_hidden_states": false,
-    "output_scores": false,
-    "pad_token_id": null,
-    "patch_size": 14,
-    "prefix": null,
-    "problem_type": null,
-    "pruned_heads": {},
-    "remove_invalid_values": false,
-    "repetition_penalty": 1.0,
-    "return_dict": true,
-    "return_dict_in_generate": false,
-    "sep_token_id": null,
-    "suppress_tokens": null,
-    "task_specific_params": null,
-    "temperature": 1.0,
-    "tf_legacy_loss": false,
-    "tie_encoder_decoder": false,
-    "tie_word_embeddings": true,
-    "tokenizer_class": null,
-    "top_k": 50,
-    "top_p": 1.0,
-    "torch_dtype": null,
-    "torchscript": false,
-    "transformers_version": "4.23.1",
-    "typical_p": 1.0,
-    "use_bfloat16": false
-  },
-  "vision_config_dict": {
-    "hidden_size": 1024,
-    "intermediate_size": 4096,
-    "num_attention_heads": 16,
-    "num_hidden_layers": 24,
-    "patch_size": 14
-  }
-}
```
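Most of the deleted checker config is generation boilerplate that `transformers` serializes by default; the substantive parts are the CLIP text/vision tower sizes and the upstream `_name_or_path`. A hedged sketch of reloading an equivalent component from that upstream checkpoint (import path as in current `diffusers`):

```python
# Sketch: the deleted config's _name_or_path points at the upstream checker,
# so an equivalent component can be reloaded from there.
from diffusers.pipelines.stable_diffusion.safety_checker import (
    StableDiffusionSafetyChecker,
)

safety_checker = StableDiffusionSafetyChecker.from_pretrained(
    "CompVis/stable-diffusion-safety-checker"
)
```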
safety_checker/pytorch_model.bin
DELETED

```diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:193490b58ef62739077262e833bf091c66c29488058681ac25cf7df3d8190974
-size 1216061799
```
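These three-line diffs for the `.bin` files are Git LFS pointers (spec version, content hash, byte size), not the weights themselves; deleting a pointer removes the weight file from the branch. A sketch of fetching the actual binary with `huggingface_hub`:

```python
# Sketch: resolving an LFS-backed weight file from the Hub; the pointer in
# the diff only records its sha256 oid and size (1,216,061,799 bytes here).
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="naclbit/trinart_stable_diffusion_v2",
    filename="safety_checker/pytorch_model.bin",
    revision="diffusers-115k",  # assumption: a branch that still carries the file
)
print(local_path)
```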
scheduler/scheduler_config.json
DELETED

```diff
@@ -1,13 +0,0 @@
-{
-  "_class_name": "PNDMScheduler",
-  "_diffusers_version": "0.6.0",
-  "beta_end": 0.012,
-  "beta_schedule": "scaled_linear",
-  "beta_start": 0.00085,
-  "clip_sample": false,
-  "num_train_timesteps": 1000,
-  "set_alpha_to_one": false,
-  "skip_prk_steps": true,
-  "steps_offset": 1,
-  "trained_betas": null
-}
```
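The scheduler config carries everything needed to rebuild the sampler, so this deleted file is fully recoverable from the diff above. A sketch using the standard `from_config` path (`clip_sample` omitted; PNDM ignores it):

```python
# Sketch: rebuilding the deleted PNDM scheduler from the config shown above.
from diffusers import PNDMScheduler

scheduler = PNDMScheduler.from_config({
    "beta_end": 0.012,
    "beta_schedule": "scaled_linear",
    "beta_start": 0.00085,
    "num_train_timesteps": 1000,
    "set_alpha_to_one": False,
    "skip_prk_steps": True,
    "steps_offset": 1,
    "trained_betas": None,
})
```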
text_encoder/config.json
DELETED

```diff
@@ -1,25 +0,0 @@
-{
-  "_name_or_path": "openai/clip-vit-large-patch14",
-  "architectures": [
-    "CLIPTextModel"
-  ],
-  "attention_dropout": 0.0,
-  "bos_token_id": 0,
-  "dropout": 0.0,
-  "eos_token_id": 2,
-  "hidden_act": "quick_gelu",
-  "hidden_size": 768,
-  "initializer_factor": 1.0,
-  "initializer_range": 0.02,
-  "intermediate_size": 3072,
-  "layer_norm_eps": 1e-05,
-  "max_position_embeddings": 77,
-  "model_type": "clip_text_model",
-  "num_attention_heads": 12,
-  "num_hidden_layers": 12,
-  "pad_token_id": 1,
-  "projection_dim": 768,
-  "torch_dtype": "float32",
-  "transformers_version": "4.23.1",
-  "vocab_size": 49408
-}
```
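Per `_name_or_path`, this is the standard CLIP ViT-L/14 text tower (12 layers, 768 hidden, 77-token context). The fine-tune ships its own weights in the `.bin` below, but the architecture itself can be rebuilt from the surviving config values; a sketch:

```python
# Sketch: instantiating the deleted text-encoder architecture (random weights)
# from the key values in the config shown above.
from transformers import CLIPTextConfig, CLIPTextModel

config = CLIPTextConfig(
    hidden_size=768,
    intermediate_size=3072,
    num_attention_heads=12,
    num_hidden_layers=12,
    max_position_embeddings=77,
    vocab_size=49408,
)
text_encoder = CLIPTextModel(config)
```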
text_encoder/pytorch_model.bin
DELETED

```diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3cc329c12fd921cda5679c859ad7025b1f3e9e492aa3cc3e87994de5287d5bd2
-size 492305335
```
tokenizer/merges.txt
DELETED
The diff for this file is too large to render.
tokenizer/special_tokens_map.json
DELETED

```diff
@@ -1,24 +0,0 @@
-{
-  "bos_token": {
-    "content": "<|startoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "pad_token": "<|endoftext|>",
-  "unk_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
-}
```
tokenizer/tokenizer_config.json
DELETED

```diff
@@ -1,34 +0,0 @@
-{
-  "add_prefix_space": false,
-  "bos_token": {
-    "__type": "AddedToken",
-    "content": "<|startoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "do_lower_case": true,
-  "eos_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "errors": "replace",
-  "model_max_length": 77,
-  "name_or_path": "openai/clip-vit-large-patch14",
-  "pad_token": "<|endoftext|>",
-  "special_tokens_map_file": "./special_tokens_map.json",
-  "tokenizer_class": "CLIPTokenizer",
-  "unk_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
-}
```
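Both tokenizer configs identify the stock CLIP tokenizer (`name_or_path: openai/clip-vit-large-patch14`, 77-token max length), so the deleted tokenizer files (including the unrendered `merges.txt` and `vocab.json`) are recoverable from upstream; a sketch:

```python
# Sketch: the deleted tokenizer files match the stock CLIP tokenizer,
# per name_or_path in the config above.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
print(tokenizer.model_max_length)  # 77, matching the deleted config
```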
tokenizer/vocab.json
DELETED
The diff for this file is too large to render.
unet/config.json
DELETED

```diff
@@ -1,36 +0,0 @@
-{
-  "_class_name": "UNet2DConditionModel",
-  "_diffusers_version": "0.6.0",
-  "act_fn": "silu",
-  "attention_head_dim": 8,
-  "block_out_channels": [
-    320,
-    640,
-    1280,
-    1280
-  ],
-  "center_input_sample": false,
-  "cross_attention_dim": 768,
-  "down_block_types": [
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "DownBlock2D"
-  ],
-  "downsample_padding": 1,
-  "flip_sin_to_cos": true,
-  "freq_shift": 0,
-  "in_channels": 4,
-  "layers_per_block": 2,
-  "mid_block_scale_factor": 1,
-  "norm_eps": 1e-05,
-  "norm_num_groups": 32,
-  "out_channels": 4,
-  "sample_size": 64,
-  "up_block_types": [
-    "UpBlock2D",
-    "CrossAttnUpBlock2D",
-    "CrossAttnUpBlock2D",
-    "CrossAttnUpBlock2D"
-  ]
-}
```
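The UNet config fully determines the network skeleton (the weights live in the `.bin` below). A sketch instantiating that skeleton from the values in the diff, with unlisted parameters left at their `diffusers` defaults:

```python
# Sketch: instantiating the deleted UNet architecture (random weights)
# from the config values shown above.
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel(
    act_fn="silu",
    attention_head_dim=8,
    block_out_channels=(320, 640, 1280, 1280),
    cross_attention_dim=768,
    down_block_types=(
        "CrossAttnDownBlock2D",
        "CrossAttnDownBlock2D",
        "CrossAttnDownBlock2D",
        "DownBlock2D",
    ),
    in_channels=4,
    layers_per_block=2,
    out_channels=4,
    sample_size=64,
    up_block_types=(
        "UpBlock2D",
        "CrossAttnUpBlock2D",
        "CrossAttnUpBlock2D",
        "CrossAttnUpBlock2D",
    ),
)
```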
unet/diffusion_pytorch_model.bin
DELETED

```diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:aaa80770bdbdc88da6dc6bbfb82120686a36796d6b73ee4d32f5622c2d6682d9
-size 3438354725
```
vae/config.json
DELETED

```diff
@@ -1,30 +0,0 @@
-{
-  "_class_name": "AutoencoderKL",
-  "_diffusers_version": "0.6.0",
-  "act_fn": "silu",
-  "block_out_channels": [
-    128,
-    256,
-    512,
-    512
-  ],
-  "down_block_types": [
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D"
-  ],
-  "in_channels": 3,
-  "latent_channels": 4,
-  "layers_per_block": 2,
-  "norm_num_groups": 32,
-  "out_channels": 3,
-  "sample_size": 256,
-  "scaling_factor": 0.18215,
-  "up_block_types": [
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D"
-  ]
-}
```
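The deleted VAE and UNet configs together imply the pipeline's native resolution: four encoder stages mean three 2x downsamples, i.e. a factor of 8 between pixels and latents, so the UNet's 64-latent `sample_size` corresponds to 512x512 output. A one-line check:

```python
# Sketch: resolution arithmetic implied by the deleted configs.
block_out_channels = [128, 256, 512, 512]
vae_scale_factor = 2 ** (len(block_out_channels) - 1)  # 8
print(64 * vae_scale_factor)                           # 512
# Latents are also scaled by scaling_factor (0.18215) between VAE and UNet.
```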
vae/diffusion_pytorch_model.bin
DELETED

```diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:b8cf5b49d164db18a485d392b2d9a9b4e3636d70613cb756d2e1bc460dd13161
-size 334707217
```