Update README.md

## Overview of Models:

### 8b and 4b models
- [Cephalo-Phi-3-vision-128k-4b-alpha](https://huggingface.co/lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha)
  - Base version of the Cephalo-Phi-3 model, trained on GPT-4o distilled image-text data from Wikipedia and scientific papers. Good baseline model, but struggles in longer conversations. Context length of 128,000 tokens.
- [Cephalo-Phi-3-vision-128k-4b-beta](https://huggingface.co/lamm-mit/Cephalo-Phi-3-vision-128k-4b-beta)
- …
- [Cephalo-Llava-v1.6-Mistral-8b-alpha](https://huggingface.co/lamm-mit/Cephalo-Llava-v1.6-Mistral-8b-alpha)
  - Trained on GPT-4o distilled image-text data from Wikipedia, with low-resolution images. Does not perform well on multiple-image queries and has some inconsistencies in understanding.
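
All of these checkpoints are hosted on Hugging Face and should load with standard `transformers` tooling. Below is a minimal inference sketch for the 4b models, assuming they keep the stock Phi-3-vision interface (custom remote code, an `AutoProcessor`, and `<|image_1|>`-style image placeholders); the image URL and question are placeholders:

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Assumption: the Cephalo-Phi-3 checkpoints expose the standard
# Phi-3-vision interface (custom modeling code, hence trust_remote_code).
model_id = "lamm-mit/Cephalo-Phi-3-vision-128k-4b-beta"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="cuda"
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Phi-3-vision marks image positions in the prompt with <|image_1|>, <|image_2|>, ...
messages = [
    {"role": "user", "content": "<|image_1|>\nWhat does this image show?"},
]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

url = "https://example.com/micrograph.png"  # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(prompt, [image], return_tensors="pt").to("cuda")
generate_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
    eos_token_id=processor.tokenizer.eos_token_id,
)

# Strip the prompt tokens so only the model's answer is decoded.
generate_ids = generate_ids[:, inputs["input_ids"].shape[1]:]
response = processor.batch_decode(generate_ids, skip_special_tokens=True)[0]
print(response)
```
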
### Merged models
- [Cephalo-Idefics-2-vision-10b-alpha](https://huggingface.co/lamm-mit/Cephalo-Idefics-2-vision-10b-alpha)
  - Merged model, 32+8=40 layers, checkpoint after the first epoch. Trained on GPT-4o distilled image-text data from Wikipedia and scientific papers.
- [Cephalo-Idefics-2-vision-10b-beta](https://huggingface.co/lamm-mit/Cephalo-Idefics-2-vision-10b-beta)
  - Merged model, 32+8=40 layers, checkpoint after the second epoch. Trained on GPT-4o distilled image-text data from Wikipedia and scientific papers.
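
The "32+8=40 layers" means the 32-layer text backbone of the Idefics-2 base is expanded with 8 additional decoder layers before fine-tuning; the alpha and beta checkpoints are the first and second epochs of that run. The sketch below illustrates the layer-duplication idea only; the base checkpoint, which layers are copied, and the module paths are all assumptions here, not the published merge recipe:

```python
import copy
import torch
from transformers import AutoModelForVision2Seq

# Illustration only: grow a 32-layer Idefics2 text backbone to 40 layers
# by duplicating a block of decoder layers (a passthrough-style self-merge).
# The duplicated range [24, 32) is an assumption, not the published recipe.
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b", torch_dtype=torch.bfloat16
)
decoder = model.model.text_model  # Mistral-based language tower
assert len(decoder.layers) == 32

extra = [copy.deepcopy(decoder.layers[i]) for i in range(24, 32)]
decoder.layers = torch.nn.ModuleList(list(decoder.layers) + extra)

# Keep per-layer bookkeeping (e.g. KV-cache indices) consistent after the copy.
for idx, layer in enumerate(decoder.layers):
    layer.self_attn.layer_idx = idx
decoder.config.num_hidden_layers = len(decoder.layers)  # now 40
# (A savable checkpoint would also need the top-level config updated.)
```

The duplicated layers start as exact copies, so the merged model is retrained on the distilled image-text data to let them differentiate; the per-epoch checkpoints above track that retraining.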
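
For inference, the merged checkpoints should behave like a stock Idefics2 model. A minimal sketch, assuming the standard `AutoProcessor`/`AutoModelForVision2Seq` classes and the Idefics2 chat template (the image URL is again a placeholder):

```python
import requests
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

# Assumption: the merged Cephalo-Idefics-2 models keep the standard
# Idefics2 processor and chat template.
model_id = "lamm-mit/Cephalo-Idefics-2-vision-10b-beta"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

url = "https://example.com/micrograph.png"  # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)

# Idefics2 interleaves images and text through structured chat messages.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe the structure shown in this image."},
    ]},
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```
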
## Citation

Please cite as: