Commit 1ce9f56 · 1 Parent(s): ac420fc
Update Home.py

Home.py CHANGED
@@ -1,6 +1,7 @@
 import streamlit as st
 
 st.set_page_config(layout='wide')
+# st.set_page_config(layout='centered')
 
 st.title('About')
 
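The hunk above keeps the wide layout and leaves the centered alternative as a comment to toggle by hand. A minimal sketch of the same choice behind a single flag (the `WIDE` constant is an assumption, not part of the commit):

```python
import streamlit as st

# Hypothetical toggle; the commit itself just comments one call in and out.
WIDE = True

# st.set_page_config must be the first Streamlit call in the script.
st.set_page_config(layout='wide' if WIDE else 'centered')

st.title('About')
```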
@@ -11,10 +12,14 @@ In the year 2020, Vision Transformers (ViT) was introduced as a Transformer mode
 Larger model and dataset sizes allow ViT to perform significantly better than ResNet, however, ViT still encountered challenges in generic computer vision tasks such as object detection and semantic segmentation.
 Swin Transformer’s success made Transformers be adopted as a generic vision backbone and showed outstanding performance in a wide range of computer vision tasks.
 Nevertheless, rather than the intrinsic inductive biases of convolutions, the success of this approach is still primarily attributed to Transformers’ inherent superiority.
+
 In 2022, Zhuang Liu et. al. proposed a pure convolutional model dubbed ConvNeXt, discovered from the modernization of a standard ResNet towards the design of Vision Transformers and claimed to outperform them.
 
 The project aims to interpret the ConvNeXt model by several visualization techniques.
-After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions
+After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions:
+> “What patterns maximally activated this filter (channel) in this layer?”\n
+> “Which features are responsible for the current prediction?”.
+
 Due to the limitation in time and resources, the project only used the tiny-sized ConvNeXt model, which was trained on ImageNet-1k at resolution 224x224 and used 50,000 images in validation set of ImageNet-1k for demo purpose.
 
 In this web app, two visualization techniques were implemented and demonstrated, they are **Maximally activating patches** and **SmoothGrad**.
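The two questions added in this hunk map onto the two implemented techniques. For “which features are responsible for the current prediction?”, SmoothGrad (Smilkov et al., 2017) averages the input gradient over several noise-perturbed copies of the image. A minimal sketch, assuming torchvision's `convnext_tiny` weights; `n_samples` and `sigma` are illustrative values, and the app's actual implementation is not shown in this commit:

```python
import torch
from torchvision.models import convnext_tiny, ConvNeXt_Tiny_Weights

model = convnext_tiny(weights=ConvNeXt_Tiny_Weights.IMAGENET1K_V1).eval()

def smoothgrad(x, target_class, n_samples=25, sigma=0.15):
    """SmoothGrad saliency for a (1, 3, 224, 224) input that does not require grad."""
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        # Each noisy copy is a fresh leaf tensor so its .grad is populated.
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy)[0, target_class]  # logit of the class of interest
        score.backward()
        grads += noisy.grad
    # Average over samples, then collapse RGB channels into one saliency map.
    return (grads / n_samples).abs().amax(dim=1)
```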
@@ -25,9 +30,11 @@ st.write(intro_text)
 
 # 4 PAGES
 sections_text = """Overall, there are 4 functionalities in this web app:
-1)
-2)
-3)
-4)
+1) Maximally activating patches: The visualization method in this page answers the question “what patterns maximally activated this filter (channel)?”.
+2) SmoothGrad: This visualization method in this page answers the question “which features are responsible for the current prediction?”.
+3) Adversarial attack: How adversarial attacks affect ConvNeXt interpretation?
+4) ImageNet1k: The storage of 50,000 images in validation set.
 """
-st.write(sections_text)
+st.write(sections_text)
+
+
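For page 1, “maximally activating patches” are typically found by recording a channel's feature map over many images and tracing the strongest responses back to their input patches. A minimal sketch with a forward hook, assuming torchvision's ConvNeXt-Tiny; the stage index and channel are illustrative, and projecting the winning cell back to pixels via its receptive field is omitted:

```python
import torch
from torchvision.models import convnext_tiny, ConvNeXt_Tiny_Weights

model = convnext_tiny(weights=ConvNeXt_Tiny_Weights.IMAGENET1K_V1).eval()

acts = {}
def save_activations(module, inputs, output):
    acts['fmap'] = output.detach()  # (batch, channels, H, W)

# Hook an intermediate stage; index 5 is an arbitrary illustrative choice.
handle = model.features[5].register_forward_hook(save_activations)

def strongest_cell(batch, channel):
    """Return (image index, row, col) where `channel` responds the most."""
    with torch.no_grad():
        model(batch)
    fmap = acts['fmap'][:, channel]  # (batch, H, W)
    n, h, w = fmap.shape
    i = fmap.reshape(-1).argmax().item()
    return i // (h * w), (i % (h * w)) // w, i % w
```

Repeating this over the 50,000 validation images and keeping the top responses per channel yields the ranked patches the page displays.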
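Page 3 asks how adversarial attacks change the interpretation. The commit does not show which attack the app uses; as an illustrative stand-in, the fast gradient sign method (FGSM, Goodfellow et al., 2015) perturbs the image one step along the sign of the loss gradient:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, label, eps=0.03):
    """One-step FGSM perturbation of input x toward higher loss."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # eps bounds the per-pixel change; 0.03 is an illustrative value.
    return (x + eps * x.grad.sign()).detach()
```

Re-running SmoothGrad on the perturbed image and comparing the two saliency maps is one way the page could demonstrate the effect.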