Improve model card (#2)
- Improve model card (140a88138dd9d9c6077e6877b3c3a025204fcb5b)
Co-authored-by: Niels Rogge <[email protected]>
README.md
CHANGED
@@ -1,21 +1,22 @@
 ---
-license: apache-2.0
 base_model:
 - google/siglip-so400m-patch14-384
-pipeline_tag: image-classification
 language:
 - en
 - zh
+license: apache-2.0
+pipeline_tag: image-feature-extraction
 ---
+
 # Oryx-ViT
 
 ## Model Summary
 
-The Oryx-ViT model is trained on 200M data and can seamlessly and efficiently process visual inputs with arbitrary spatial sizes and temporal lengths.
+The Oryx-ViT model is trained on 200M data and can seamlessly and efficiently process visual inputs with arbitrary spatial sizes and temporal lengths. It is described in the paper [Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution](https://arxiv.org/abs/2409.12961).
 
 - **Repository:** https://github.com/Oryx-mllm/Oryx
+- **Project Page:** https://oryx-mllm.github.io
 - **Languages:** English, Chinese
-- **Paper:** https://arxiv.org/abs/2409.12961
 
 
 ### Model Architecture
@@ -30,4 +31,13 @@ The Oryx-ViT model is trained on 200M data and can seamlessly and efficiently pr
 - **Orchestration:** HuggingFace Trainer
 - **Code:** Pytorch
 
-## Citation
+## Citation
+
+```bibtex
+@article{liu2024oryx,
+title={Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution},
+author={Liu, Zuyan and Dong, Yuhao and Liu, Ziwei and Hu, Winston and Lu, Jiwen and Rao, Yongming},
+journal={arXiv preprint arXiv:2409.12961},
+year={2024}
+}
+```