Update README.md
README.md
@@ -36,19 +36,19 @@ To enhance the OCR capability of the model, we have incorporated additional OCR
 ### Vision Foundation model
 | Model | Date | Download | Note |
 | ----------------------- | ---------- | ---------------------------------------------------------------------- | -------------------------------- |
-| InternViT-6B-448px-V1
-| InternViT-6B-448px-V1
-| InternViT-6B-448px-V1
+| InternViT-6B-448px-V1-5 | 2024.04.20 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5) | support dynamic resolution, super strong OCR (🔥new) |
+| InternViT-6B-448px-V1-2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2) | 448 resolution |
+| InternViT-6B-448px-V1-0 | 2024.01.30 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0) | 448 resolution |
 | InternViT-6B-224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-224px) | vision foundation model |
 | InternVL-14B-224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-14B-224px) | vision-language foundation model |
 
 ### Multimodal Large Language Model (MLLM)
 | Model | Date | Download | Note |
 | ----------------------- | ---------- | --------------------------------------------------------------------------- | ---------------------------------- |
-| InternVL-Chat-V1
-| InternVL-Chat-V1
-| InternVL-Chat-V1
-| InternVL-Chat-V1
+| InternVL-Chat-V1-5 | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5) | support 4K image; super strong OCR; Approaching the performance of GPT-4V and Gemini Pro on various benchmarks like MMMU, DocVQA, ChartQA, MathVista, etc. (🔥new)|
+| InternVL-Chat-V1-2-Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) | more SFT data and stronger |
+| InternVL-Chat-V1-2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) | scaling up LLM to 34B |
+| InternVL-Chat-V1-1 | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) | support Chinese and stronger OCR |
 
 
 ## Model Usage (Image Embeddings)