- **Self-Attention-Based Knowledge Distillation**: The core technique in KOALA focuses on the distillation of self-attention features, which proves crucial for maintaining image generation quality.
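The self-attention distillation above can be sketched as matching the student U-Net's attention probability maps against the teacher's. The toy example below uses random features and an MSE objective; the shapes, the loss choice, and the function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_map(q, k):
    # scaled dot-product attention probabilities (tokens x tokens)
    return softmax(q @ k.T / np.sqrt(q.shape[-1]))

rng = np.random.default_rng(0)
tokens, dim = 16, 8

# stand-ins for teacher (large U-Net) and student (compressed U-Net) attention maps
teacher = self_attention_map(rng.normal(size=(tokens, dim)),
                             rng.normal(size=(tokens, dim)))
student = self_attention_map(rng.normal(size=(tokens, dim)),
                             rng.normal(size=(tokens, dim)))

# distillation term: penalize the student for attending differently than the teacher
kd_loss = np.mean((student - teacher) ** 2)
print(f"self-attention KD loss: {kd_loss:.5f}")
```

In training, a term like this would be added to the usual denoising objective so the compressed U-Net inherits where the teacher attends.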
## Model Description

- Developed by [ETRI Visual Intelligence Lab](https://huggingface.co/etri-vilab)
- Developers: [Youngwan Lee](https://youngwanlee.github.io/), [Kwanyong Park](https://pkyong95.github.io/), [Yoorhim Cho](https://ofzlo.github.io/), [Young-Ju Lee](https://scholar.google.com/citations?user=6goOQh8AAAAJ&hl=en), [Sung Ju Hwang](http://www.sungjuhwang.com/)
- Model Description: a latent-diffusion-based text-to-image generative model. KOALA models use the same text encoders as [SDXL-Base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and replace only the denoising U-Net with compressed U-Nets.
- Resources for more information: the [KOALA report on arXiv](https://arxiv.org/abs/2312.04005) and the [project page](https://youngwanlee.github.io/KOALA/).
## Usage with 🤗[Diffusers library](https://github.com/huggingface/diffusers)

Inference code with 25 denoising steps:

```python
negative = "worst quality, low quality, illustration, low resolution"

image = pipe(prompt=prompt, negative_prompt=negative).images[0]
```
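A note on how `negative_prompt` acts in the snippet above: at every denoising step the U-Net produces one noise estimate conditioned on the prompt and one conditioned on the negative text, and classifier-free guidance extrapolates away from the latter. A toy numpy sketch of that combination step (the numbers are illustrative, and the guidance scale of 5.0 is an assumed default, not taken from this README):

```python
import numpy as np

def cfg_combine(eps_neg, eps_pos, guidance_scale=5.0):
    # classifier-free guidance: move the noise estimate away from the
    # negative/unconditional branch, toward the prompt-conditioned one
    return eps_neg + guidance_scale * (eps_pos - eps_neg)

# stand-ins for the two U-Net noise predictions at one denoising step
eps_neg = np.array([0.1, 0.2])
eps_pos = np.array([0.3, 0.1])

print(cfg_combine(eps_neg, eps_pos))
```

Larger guidance scales follow the prompt (and avoid the negative prompt) more aggressively, at some cost in sample diversity.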

## Uses

### Direct Use

The model is intended for research purposes only. Possible research areas and tasks include:

- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.
### Out-of-Scope Use

The model was not trained to produce factual or truthful representations of people or events, so using it to generate such content is out of scope.
## Limitations and Bias

- Text rendering: the models face challenges in rendering long, legible text within images.
- Complex prompts: KOALA sometimes struggles with complex prompts involving multiple attributes.
- Dataset dependencies: these limitations are partially attributable to the characteristics of the training dataset (LAION-Aesthetics-V2 6+).
## Citation

```bibtex
@misc{Lee@koala,