---
license: other
license_name: nvclv1
license_link: LICENSE
datasets:
- ILSVRC/imagenet-1k
pipeline_tag: image-classification
---

[**MambaVision: A Hybrid Mamba-Transformer Vision Backbone**](https://arxiv.org/abs/2407.08083).

### Model Overview

We introduce a novel mixer block that adds a symmetric path without SSM to enhance the modeling of global context. MambaVision has a hierarchical architecture that employs both self-attention and mixer blocks.

### Model Performance

MambaVision demonstrates strong performance, achieving a new SOTA Pareto front in terms of Top-1 accuracy and throughput.

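For reference, the mixer block mentioned in the Model Overview can be summarized schematically as two parallel branches whose outputs are merged. This is a hedged sketch, not the exact formulation: the symbol names below are illustrative, $\mathrm{Scan}$ is assumed to denote the selective scan (SSM), $\mathrm{Conv}$ a 1D convolution, and $\sigma$ a SiLU-style activation. The linked paper gives the precise definition and channel dimensions.

$$
\begin{aligned}
X_1 &= \mathrm{Scan}\big(\sigma(\mathrm{Conv}(\mathrm{Linear}_1(X_{\text{in}})))\big) \\
X_2 &= \sigma\big(\mathrm{Conv}(\mathrm{Linear}_2(X_{\text{in}}))\big) \\
X_{\text{out}} &= \mathrm{Linear}_{\text{out}}\big(\mathrm{Concat}(X_1, X_2)\big)
\end{aligned}
$$

Here $X_2$ is the symmetric path without SSM: it mirrors the first branch but omits the scan, and the concatenation lets the output projection combine both views of the input.
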
### Model Usage

You must first log in to HuggingFace to pull the model:

```Bash
huggingface-cli login
```

The model can then be used as follows:

```Python
access_token = "