feature/diffusers-model

#3
by ayan4m1 - opened
README.md CHANGED
@@ -1,81 +1,17 @@
 ---
-inference: true
+inference: false
 tags:
 - stable-diffusion
 - stable-diffusion-diffusers
 - text-to-image
-license: creativeml-openrail-m
+license: apache-2.0
 ---
 
 ## Please Note!
 
-This model is NOT the 19.2M images Characters Model on TrinArt, but an improved version of the original Trin-sama Twitter bot model. This model is intended to retain the original SD's aesthetics as much as possible while nudging the model to anime/manga style.
-
-Other TrinArt models can be found at:
-
-https://huggingface.co/naclbit/trinart_derrida_characters_v2_stable_diffusion
-
-https://huggingface.co/naclbit/trinart_characters_19.2m_stable_diffusion_v1
-
-
-## Diffusers
-
-The model has been ported to `diffusers` by [ayan4m1](https://huggingface.co/ayan4m1)
-and can easily be run from one of the branches:
-- `revision="diffusers-60k"` for the checkpoint trained on 60,000 steps,
-- `revision="diffusers-95k"` for the checkpoint trained on 95,000 steps,
-- `revision="diffusers-115k"` for the checkpoint trained on 115,000 steps.
-
-For more information, please have a look at [the "Three flavors" section](#three-flavors).
-
-## Gradio
-
-We also support a [Gradio](https://github.com/gradio-app/gradio) web ui with diffusers to run inside a colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1RWvik_C7nViiR9bNsu3fvMR3STx6RvDx?usp=sharing)
-
-
-### Example Text2Image
-
-```python
-# !pip install diffusers==0.3.0
-from diffusers import StableDiffusionPipeline
-
-# using the 60,000 steps checkpoint
-pipe = StableDiffusionPipeline.from_pretrained("naclbit/trinart_stable_diffusion_v2", revision="diffusers-60k")
-pipe.to("cuda")
-
-image = pipe("A magical dragon flying in front of the Himalaya in manga style").images[0]
-image
-```
-
-![dragon](https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/a_magical_dragon_himalaya.png)
-
-If you want to run the pipeline faster or on different hardware, please have a look at the [optimization docs](https://huggingface.co/docs/diffusers/optimization/fp16).
-
-### Example Image2Image
-
-```python
-# !pip install diffusers==0.3.0
-from diffusers import StableDiffusionImg2ImgPipeline
-import requests
-from PIL import Image
-from io import BytesIO
-
-url = "https://scitechdaily.com/images/Dog-Park.jpg"
-
-response = requests.get(url)
-init_image = Image.open(BytesIO(response.content)).convert("RGB")
-init_image = init_image.resize((768, 512))
-
-# using the 115,000 steps checkpoint
-pipe = StableDiffusionImg2ImgPipeline.from_pretrained("naclbit/trinart_stable_diffusion_v2", revision="diffusers-115k")
-pipe.to("cuda")
-
-images = pipe(prompt="Manga drawing of Brad Pitt", init_image=init_image, strength=0.75, guidance_scale=7.5).images
-images[0]
-```
-
-If you want to run the pipeline faster or on different hardware, please have a look at the [optimization docs](https://huggingface.co/docs/diffusers/optimization/fp16).
+This model is NOT the 19.2M images Characters Model on TrinArt, but an improved version of the original trinsama Twitter bot model. This model is intended to retain the original SD's aesthetics as much as possible while nudging the model to anime/manga style.
+
+This model is NOT the TrinArt Characters model (the model retrained on 19.2M images)! It is an improved version of the Trin-sama AI bot model. It is intended to retain the art style of the original SD v1.4 model as much as possible while steering it in an anime/manga direction.
 
 ## Stable Diffusion TrinArt/Trin-sama AI finetune v2
@@ -126,4 +62,4 @@ Each images were diffused using K. Crowson's k-lms (from k-diffusion repo) method
 
 #### License
 
-CreativeML OpenRAIL-M
+Apache License 2.0
feature_extractor/preprocessor_config.json DELETED
@@ -1,20 +0,0 @@
-{
-  "crop_size": 224,
-  "do_center_crop": true,
-  "do_convert_rgb": true,
-  "do_normalize": true,
-  "do_resize": true,
-  "feature_extractor_type": "CLIPFeatureExtractor",
-  "image_mean": [
-    0.48145466,
-    0.4578275,
-    0.40821073
-  ],
-  "image_std": [
-    0.26862954,
-    0.26130258,
-    0.27577711
-  ],
-  "resample": 3,
-  "size": 224
-}
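
The deleted preprocessor config above encodes standard CLIP image preprocessing: resize to 224, center-crop, convert to RGB, then per-channel normalization. As a minimal sketch of that last step (plain Python for illustration, not the actual `CLIPFeatureExtractor` code):

```python
# Per-channel normalization using the values from the deleted
# preprocessor_config.json; inputs are RGB values already scaled to [0, 1].
image_mean = [0.48145466, 0.4578275, 0.40821073]
image_std = [0.26862954, 0.26130258, 0.27577711]

def normalize(rgb):
    """normalized = (pixel - mean) / std, applied channel-wise."""
    return [(c - m) / s for c, m, s in zip(rgb, image_mean, image_std)]

# A pixel equal to the dataset mean maps to zero in every channel.
print(normalize(image_mean))  # [0.0, 0.0, 0.0]
```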
model_index.json DELETED
@@ -1,32 +0,0 @@
-{
-  "_class_name": "StableDiffusionPipeline",
-  "_diffusers_version": "0.6.0",
-  "feature_extractor": [
-    "transformers",
-    "CLIPImageProcessor"
-  ],
-  "safety_checker": [
-    "stable_diffusion",
-    "StableDiffusionSafetyChecker"
-  ],
-  "scheduler": [
-    "diffusers",
-    "PNDMScheduler"
-  ],
-  "text_encoder": [
-    "transformers",
-    "CLIPTextModel"
-  ],
-  "tokenizer": [
-    "transformers",
-    "CLIPTokenizer"
-  ],
-  "unet": [
-    "diffusers",
-    "UNet2DConditionModel"
-  ],
-  "vae": [
-    "diffusers",
-    "AutoencoderKL"
-  ]
-}
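
The deleted `model_index.json` above maps each pipeline component to the `[library, class]` pair used to instantiate it; keys beginning with `_` are metadata. A small sketch of reading that mapping (the JSON inlined as a dict here purely for illustration):

```python
# Hypothetical: the deleted model_index.json, inlined as a dict.
model_index = {
    "_class_name": "StableDiffusionPipeline",
    "_diffusers_version": "0.6.0",
    "feature_extractor": ["transformers", "CLIPImageProcessor"],
    "safety_checker": ["stable_diffusion", "StableDiffusionSafetyChecker"],
    "scheduler": ["diffusers", "PNDMScheduler"],
    "text_encoder": ["transformers", "CLIPTextModel"],
    "tokenizer": ["transformers", "CLIPTokenizer"],
    "unet": ["diffusers", "UNet2DConditionModel"],
    "vae": ["diffusers", "AutoencoderKL"],
}

# Component entries are [library, class]; underscore keys are metadata.
components = {k: tuple(v) for k, v in model_index.items() if not k.startswith("_")}
print(components["unet"])  # ('diffusers', 'UNet2DConditionModel')
```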
safety_checker/config.json DELETED
@@ -1,179 +0,0 @@
-{
-  "_commit_hash": "4bb648a606ef040e7685bde262611766a5fdd67b",
-  "_name_or_path": "CompVis/stable-diffusion-safety-checker",
-  "architectures": [
-    "StableDiffusionSafetyChecker"
-  ],
-  "initializer_factor": 1.0,
-  "logit_scale_init_value": 2.6592,
-  "model_type": "clip",
-  "projection_dim": 768,
-  "text_config": {
-    "_name_or_path": "",
-    "add_cross_attention": false,
-    "architectures": null,
-    "attention_dropout": 0.0,
-    "bad_words_ids": null,
-    "begin_suppress_tokens": null,
-    "bos_token_id": 0,
-    "chunk_size_feed_forward": 0,
-    "cross_attention_hidden_size": null,
-    "decoder_start_token_id": null,
-    "diversity_penalty": 0.0,
-    "do_sample": false,
-    "dropout": 0.0,
-    "early_stopping": false,
-    "encoder_no_repeat_ngram_size": 0,
-    "eos_token_id": 2,
-    "exponential_decay_length_penalty": null,
-    "finetuning_task": null,
-    "forced_bos_token_id": null,
-    "forced_eos_token_id": null,
-    "hidden_act": "quick_gelu",
-    "hidden_size": 768,
-    "id2label": {
-      "0": "LABEL_0",
-      "1": "LABEL_1"
-    },
-    "initializer_factor": 1.0,
-    "initializer_range": 0.02,
-    "intermediate_size": 3072,
-    "is_decoder": false,
-    "is_encoder_decoder": false,
-    "label2id": {
-      "LABEL_0": 0,
-      "LABEL_1": 1
-    },
-    "layer_norm_eps": 1e-05,
-    "length_penalty": 1.0,
-    "max_length": 20,
-    "max_position_embeddings": 77,
-    "min_length": 0,
-    "model_type": "clip_text_model",
-    "no_repeat_ngram_size": 0,
-    "num_attention_heads": 12,
-    "num_beam_groups": 1,
-    "num_beams": 1,
-    "num_hidden_layers": 12,
-    "num_return_sequences": 1,
-    "output_attentions": false,
-    "output_hidden_states": false,
-    "output_scores": false,
-    "pad_token_id": 1,
-    "prefix": null,
-    "problem_type": null,
-    "pruned_heads": {},
-    "remove_invalid_values": false,
-    "repetition_penalty": 1.0,
-    "return_dict": true,
-    "return_dict_in_generate": false,
-    "sep_token_id": null,
-    "suppress_tokens": null,
-    "task_specific_params": null,
-    "temperature": 1.0,
-    "tf_legacy_loss": false,
-    "tie_encoder_decoder": false,
-    "tie_word_embeddings": true,
-    "tokenizer_class": null,
-    "top_k": 50,
-    "top_p": 1.0,
-    "torch_dtype": null,
-    "torchscript": false,
-    "transformers_version": "4.23.1",
-    "typical_p": 1.0,
-    "use_bfloat16": false,
-    "vocab_size": 49408
-  },
-  "text_config_dict": {
-    "hidden_size": 768,
-    "intermediate_size": 3072,
-    "num_attention_heads": 12,
-    "num_hidden_layers": 12
-  },
-  "torch_dtype": "float32",
-  "transformers_version": null,
-  "vision_config": {
-    "_name_or_path": "",
-    "add_cross_attention": false,
-    "architectures": null,
-    "attention_dropout": 0.0,
-    "bad_words_ids": null,
-    "begin_suppress_tokens": null,
-    "bos_token_id": null,
-    "chunk_size_feed_forward": 0,
-    "cross_attention_hidden_size": null,
-    "decoder_start_token_id": null,
-    "diversity_penalty": 0.0,
-    "do_sample": false,
-    "dropout": 0.0,
-    "early_stopping": false,
-    "encoder_no_repeat_ngram_size": 0,
-    "eos_token_id": null,
-    "exponential_decay_length_penalty": null,
-    "finetuning_task": null,
-    "forced_bos_token_id": null,
-    "forced_eos_token_id": null,
-    "hidden_act": "quick_gelu",
-    "hidden_size": 1024,
-    "id2label": {
-      "0": "LABEL_0",
-      "1": "LABEL_1"
-    },
-    "image_size": 224,
-    "initializer_factor": 1.0,
-    "initializer_range": 0.02,
-    "intermediate_size": 4096,
-    "is_decoder": false,
-    "is_encoder_decoder": false,
-    "label2id": {
-      "LABEL_0": 0,
-      "LABEL_1": 1
-    },
-    "layer_norm_eps": 1e-05,
-    "length_penalty": 1.0,
-    "max_length": 20,
-    "min_length": 0,
-    "model_type": "clip_vision_model",
-    "no_repeat_ngram_size": 0,
-    "num_attention_heads": 16,
-    "num_beam_groups": 1,
-    "num_beams": 1,
-    "num_channels": 3,
-    "num_hidden_layers": 24,
-    "num_return_sequences": 1,
-    "output_attentions": false,
-    "output_hidden_states": false,
-    "output_scores": false,
-    "pad_token_id": null,
-    "patch_size": 14,
-    "prefix": null,
-    "problem_type": null,
-    "pruned_heads": {},
-    "remove_invalid_values": false,
-    "repetition_penalty": 1.0,
-    "return_dict": true,
-    "return_dict_in_generate": false,
-    "sep_token_id": null,
-    "suppress_tokens": null,
-    "task_specific_params": null,
-    "temperature": 1.0,
-    "tf_legacy_loss": false,
-    "tie_encoder_decoder": false,
-    "tie_word_embeddings": true,
-    "tokenizer_class": null,
-    "top_k": 50,
-    "top_p": 1.0,
-    "torch_dtype": null,
-    "torchscript": false,
-    "transformers_version": "4.23.1",
-    "typical_p": 1.0,
-    "use_bfloat16": false
-  },
-  "vision_config_dict": {
-    "hidden_size": 1024,
-    "intermediate_size": 4096,
-    "num_attention_heads": 16,
-    "num_hidden_layers": 24,
-    "patch_size": 14
-  }
-}
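
The vision tower in the deleted safety-checker config is the standard CLIP ViT-L/14 geometry. A quick arithmetic check of its sequence length from the config values (`image_size` 224, `patch_size` 14), sketched in plain Python:

```python
# ViT-L/14 geometry from the deleted safety_checker config.
image_size, patch_size = 224, 14
patches_per_side = image_size // patch_size   # 16 patches along each axis
num_patches = patches_per_side ** 2           # 256 image patches
seq_len = num_patches + 1                     # +1 for the class token
print(seq_len)  # 257
```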
safety_checker/pytorch_model.bin DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:193490b58ef62739077262e833bf091c66c29488058681ac25cf7df3d8190974
-size 1216061799
scheduler/scheduler_config.json DELETED
@@ -1,13 +0,0 @@
-{
-  "_class_name": "PNDMScheduler",
-  "_diffusers_version": "0.6.0",
-  "beta_end": 0.012,
-  "beta_schedule": "scaled_linear",
-  "beta_start": 0.00085,
-  "clip_sample": false,
-  "num_train_timesteps": 1000,
-  "set_alpha_to_one": false,
-  "skip_prk_steps": true,
-  "steps_offset": 1,
-  "trained_betas": null
-}
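
The deleted scheduler config pins PNDM with a `scaled_linear` beta schedule: betas are interpolated linearly between `sqrt(beta_start)` and `sqrt(beta_end)` and then squared. An illustrative recomputation from the config values (a sketch, not the diffusers implementation itself):

```python
# "scaled_linear" schedule recomputed from the deleted scheduler_config.json.
beta_start, beta_end, num_train_timesteps = 0.00085, 0.012, 1000

start, end = beta_start ** 0.5, beta_end ** 0.5
betas = [
    (start + (end - start) * t / (num_train_timesteps - 1)) ** 2
    for t in range(num_train_timesteps)
]

# Endpoints recover beta_start/beta_end; the schedule increases monotonically.
assert abs(betas[0] - beta_start) < 1e-9
assert abs(betas[-1] - beta_end) < 1e-9
```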
text_encoder/config.json DELETED
@@ -1,25 +0,0 @@
-{
-  "_name_or_path": "openai/clip-vit-large-patch14",
-  "architectures": [
-    "CLIPTextModel"
-  ],
-  "attention_dropout": 0.0,
-  "bos_token_id": 0,
-  "dropout": 0.0,
-  "eos_token_id": 2,
-  "hidden_act": "quick_gelu",
-  "hidden_size": 768,
-  "initializer_factor": 1.0,
-  "initializer_range": 0.02,
-  "intermediate_size": 3072,
-  "layer_norm_eps": 1e-05,
-  "max_position_embeddings": 77,
-  "model_type": "clip_text_model",
-  "num_attention_heads": 12,
-  "num_hidden_layers": 12,
-  "pad_token_id": 1,
-  "projection_dim": 768,
-  "torch_dtype": "float32",
-  "transformers_version": "4.23.1",
-  "vocab_size": 49408
-}
text_encoder/pytorch_model.bin DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3cc329c12fd921cda5679c859ad7025b1f3e9e492aa3cc3e87994de5287d5bd2
-size 492305335
tokenizer/merges.txt DELETED
The diff for this file is too large to render. See raw diff
 
tokenizer/special_tokens_map.json DELETED
@@ -1,24 +0,0 @@
-{
-  "bos_token": {
-    "content": "<|startoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "pad_token": "<|endoftext|>",
-  "unk_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
-}
tokenizer/tokenizer_config.json DELETED
@@ -1,34 +0,0 @@
-{
-  "add_prefix_space": false,
-  "bos_token": {
-    "__type": "AddedToken",
-    "content": "<|startoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "do_lower_case": true,
-  "eos_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "errors": "replace",
-  "model_max_length": 77,
-  "name_or_path": "openai/clip-vit-large-patch14",
-  "pad_token": "<|endoftext|>",
-  "special_tokens_map_file": "./special_tokens_map.json",
-  "tokenizer_class": "CLIPTokenizer",
-  "unk_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
-}
tokenizer/vocab.json DELETED
The diff for this file is too large to render. See raw diff
 
unet/config.json DELETED
@@ -1,36 +0,0 @@
-{
-  "_class_name": "UNet2DConditionModel",
-  "_diffusers_version": "0.6.0",
-  "act_fn": "silu",
-  "attention_head_dim": 8,
-  "block_out_channels": [
-    320,
-    640,
-    1280,
-    1280
-  ],
-  "center_input_sample": false,
-  "cross_attention_dim": 768,
-  "down_block_types": [
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "DownBlock2D"
-  ],
-  "downsample_padding": 1,
-  "flip_sin_to_cos": true,
-  "freq_shift": 0,
-  "in_channels": 4,
-  "layers_per_block": 2,
-  "mid_block_scale_factor": 1,
-  "norm_eps": 1e-05,
-  "norm_num_groups": 32,
-  "out_channels": 4,
-  "sample_size": 64,
-  "up_block_types": [
-    "UpBlock2D",
-    "CrossAttnUpBlock2D",
-    "CrossAttnUpBlock2D",
-    "CrossAttnUpBlock2D"
-  ]
-}
unet/diffusion_pytorch_model.bin DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:aaa80770bdbdc88da6dc6bbfb82120686a36796d6b73ee4d32f5622c2d6682d9
-size 3438354725
vae/config.json DELETED
@@ -1,30 +0,0 @@
-{
-  "_class_name": "AutoencoderKL",
-  "_diffusers_version": "0.6.0",
-  "act_fn": "silu",
-  "block_out_channels": [
-    128,
-    256,
-    512,
-    512
-  ],
-  "down_block_types": [
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D"
-  ],
-  "in_channels": 3,
-  "latent_channels": 4,
-  "layers_per_block": 2,
-  "norm_num_groups": 32,
-  "out_channels": 3,
-  "sample_size": 256,
-  "scaling_factor": 0.18215,
-  "up_block_types": [
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D"
-  ]
-}
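
The deleted VAE config lists four encoder blocks; in diffusers' `AutoencoderKL` that means three 2x downsampling steps, i.e. an 8x spatial compression. Combined with the UNet's latent `sample_size` of 64 from the deleted unet config, this implies the pipeline's native output resolution. A small arithmetic sketch (my reading of the two configs, not library code):

```python
# Spatial compression implied by the deleted configs:
# AutoencoderKL downsamples once per encoder block after the first.
vae_block_out_channels = [128, 256, 512, 512]              # from vae/config.json
vae_scale_factor = 2 ** (len(vae_block_out_channels) - 1)  # 8x compression

unet_sample_size = 64  # latent resolution, from unet/config.json
output_resolution = unet_sample_size * vae_scale_factor

print(output_resolution)  # 512 -> the pipeline natively generates 512x512 images
```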
vae/diffusion_pytorch_model.bin DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:b8cf5b49d164db18a485d392b2d9a9b4e3636d70613cb756d2e1bc460dd13161
-size 334707217