Update README.md
README.md CHANGED
@@ -78,22 +78,24 @@ achieve competitive results on 17 benchmarks.
 
 ## 🚀 Inference
 
+All models can be loaded from 🤗 with `.from_pretrained()`.
+Check out the [example scripts](https://github.com/FreedomIntelligence/ALLaVA/tree/main/allava/serve) and make sure you get the same outputs as shown in the scripts.
+<!-- ### Load from 🤗 (Recommended)
+See the [example script](https://github.com/FreedomIntelligence/ALLaVA/blob/main/allava/serve/huggingface_inference.py). -->
 
-### CLI
-
-See [here](https://github.com/FreedomIntelligence/ALLaVA/tree/main?tab=readme-ov-file#cli) for CLI code snippet.
+<!-- ### CLI
+See [here](https://github.com/FreedomIntelligence/ALLaVA/tree/main?tab=readme-ov-file#cli) for CLI code snippet. -->
 
 ## 🏋️‍♀️ Training
 
 ### Data
 
+<div align=center>
 <img src="training_datasets_by_stage.jpg" width = "640" alt="training_datasets" align=center />
 </div>
 
-ALLaVA uses
+ALLaVA uses 1.0M and 1.5M samples for PT and FT, respectively.
 
 ### Code
@@ -110,7 +112,7 @@ These two models share the same PT procedure. -->
 
 ### Hyperparameters
 
 | Global Batch Size | ZeRO Stage | Optimizer | Max LR | Min LR | Scheduler | Weight decay |
 | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
 | 256 (PT) / 128 (FT) | 1 | AdamW | 2e-5 | 2e-6 | CosineAnnealingWarmRestarts | 0 |
 
 The LM backbone and projector are trainable, while the vision encoder is kept frozen.
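The `CosineAnnealingWarmRestarts` entry in the hyperparameter table treats the Max LR (2e-5) and Min LR (2e-6) as the two ends of each cosine cycle. A minimal pure-Python sketch of that schedule (illustrative only, not the repo's training code; the cycle length `period` is an arbitrary assumption):

```python
import math

def cosine_annealing_lr(step, period, lr_max=2e-5, lr_min=2e-6):
    """Learning rate at `step` for cosine annealing with warm restarts
    (T_mult = 1): the LR decays from lr_max toward lr_min over each
    `period`-step cycle, then snaps back to lr_max at the restart."""
    t = step % period  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / period))

print(cosine_annealing_lr(0, 1000))     # cycle start: lr_max
print(cosine_annealing_lr(500, 1000))   # mid-cycle: midpoint of the two LRs
print(cosine_annealing_lr(1000, 1000))  # warm restart: back to lr_max
```

Unlike plain cosine annealing, the schedule never stays at the minimum: each restart jumps the LR back to its maximum.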
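The 256 (PT) / 128 (FT) global batch sizes are normally realized as per-device micro-batch × number of GPUs × gradient-accumulation steps; the specific splits below are hypothetical illustrations, not the configuration actually used:

```python
def global_batch_size(per_device_batch, num_gpus, grad_accum_steps):
    """Effective batch size seen by the optimizer under data parallelism."""
    return per_device_batch * num_gpus * grad_accum_steps

# Hypothetical splits that reach the table's values:
print(global_batch_size(4, 8, 8))  # 256, matching PT
print(global_batch_size(4, 8, 4))  # 128, matching FT
```

ZeRO stage 1 (as in the table) shards only optimizer states across data-parallel ranks, so this arithmetic is the same as for plain data parallelism.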