Update README.md

## Overview of Models:

### 8b and 4b models
- [Cephalo-Phi-3-vision-128k-4b-alpha](https://huggingface.co/lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha)
  - Base version of the Cephalo-Phi-3 model, trained on GPT-4o distilled image-text data from Wikipedia and scientific papers. Good baseline model, but struggles in longer conversations. Context length of 128,000 tokens.
- [Cephalo-Phi-3-vision-128k-4b-beta](https://huggingface.co/lamm-mit/Cephalo-Phi-3-vision-128k-4b-beta)
- …
- [Cephalo-Llava-v1.6-Mistral-8b-alpha](https://huggingface.co/lamm-mit/Cephalo-Llava-v1.6-Mistral-8b-alpha)
  - Trained on GPT-4o distilled image-text data from Wikipedia, with low-resolution images. Does not perform well on multiple-image queries and has some inconsistencies in understanding.
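
All of these checkpoints are hosted on Hugging Face and should load with standard `transformers` tooling. Below is a minimal inference sketch for the 4b models, assuming they keep the stock Phi-3-vision interface (custom remote code, an `AutoProcessor`, and `<|image_1|>`-style image placeholders); the image URL and question are placeholders:

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Assumption: the Cephalo-Phi-3 checkpoints expose the standard
# Phi-3-vision interface (custom modeling code, hence trust_remote_code).
model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b-beta"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="cuda"
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Phi-3-vision marks image positions in the prompt with <|image_1|>, <|image_2|>, ...
messages = [
    {"role": "user", "content": "<|image_1|>\nWhat does this image show?"},
]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

url = "https://example.com/micrograph.png"  # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(prompt, [image], return_tensors="pt").to("cuda")
generate_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
    eos_token_id=processor.tokenizer.eos_token_id,
)

# Strip the prompt tokens so only the model's answer is decoded.
generate_ids = generate_ids[:, inputs["input_ids"].shape[1]:]
response = processor.batch_decode(generate_ids, skip_special_tokens=True)[0]
print(response)
```
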
### Merged models
- [Cephalo-Idefics-2-vision-10b-alpha](https://huggingface.co/lamm-mit/Cephalo-Idefics-2-vision-10b-alpha)
  - Merged model, 32+8=40 layers, checkpoint after the first epoch. Trained on GPT-4o distilled image-text data from Wikipedia and scientific papers.
- [Cephalo-Idefics-2-vision-10b-beta](https://huggingface.co/lamm-mit/Cephalo-Idefics-2-vision-10b-beta)
  - Merged model, 32+8=40 layers, checkpoint after the second epoch. Trained on GPT-4o distilled image-text data from Wikipedia and scientific papers.
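
The "32+8=40 layers" means the 32-layer text backbone of the Idefics-2 base is expanded with 8 additional decoder layers before fine-tuning; the alpha and beta checkpoints are the first and second epochs of that run. The sketch below illustrates the layer-duplication idea only; the base checkpoint, which layers are copied, and the module paths are all assumptions here, not the published merge recipe:

```python
import copy
import torch
from transformers import AutoModelForVision2Seq

# Illustration only: grow a 32-layer Idefics2 text backbone to 40 layers
# by duplicating a block of decoder layers (a passthrough-style self-merge).
# The duplicated range [24, 32) is an assumption, not the published recipe.
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b", torch_dtype=torch.bfloat16
)
decoder = model.model.text_model  # Mistral-based language tower
assert len(decoder.layers) == 32

extra = [copy.deepcopy(decoder.layers[i]) for i in range(24, 32)]
decoder.layers = torch.nn.ModuleList(list(decoder.layers) + extra)

# Keep per-layer bookkeeping (e.g. KV-cache indices) consistent after the copy.
for idx, layer in enumerate(decoder.layers):
    layer.self_attn.layer_idx = idx
decoder.config.num_hidden_layers = len(decoder.layers)  # now 40
# (A savable checkpoint would also need the top-level config updated.)
```

The duplicated layers start as exact copies, so the merged model is retrained on the distilled image-text data to let them differentiate; the per-epoch checkpoints above track that retraining.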
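
For inference, the merged checkpoints should behave like a stock Idefics2 model. A minimal sketch, assuming the standard `AutoProcessor`/`AutoModelForVision2Seq` classes and the Idefics2 chat template (the image URL is again a placeholder):

```python
import requests
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

# Assumption: the merged Cephalo-Idefics-2 models keep the standard
# Idefics2 processor and chat template.
model_id = "lamm-mit/Cephalo-Idefics-2-vision-10b-beta"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

url = "https://example.com/micrograph.png"  # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)

# Idefics2 interleaves images and text through structured chat messages.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe the structure shown in this image."},
    ]},
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```
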
## Citation

Please cite as: