---
license: mit
datasets:
  - liuhaotian/LLaVA-Instruct-150K
  - liuhaotian/LLaVA-Pretrain
language:
  - en
pipeline_tag: visual-question-answering
---

# Model Card for Llava-Phi2

This is a multimodal implementation of the Phi-2 model, inspired by LLaVA-Phi.

## Model Details

  1. LLM Backbone: Phi-2
  2. Vision Tower: clip-vit-large-patch14-336
  3. Pretraining Dataset: LAION-CC-SBU dataset with BLIP captions (200K samples)
  4. Finetuning Dataset: LLaVA-Instruct-150K dataset based on COCO
  5. Finetuned Model: marianna13/llava-phi-2-3b (see the usage sketch below)
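
Below is a minimal usage sketch. It assumes the finetuned checkpoint can be loaded through the Hugging Face `transformers` LLaVA classes (`LlavaForConditionalGeneration` and `AutoProcessor`) and that it follows the common `USER: <image> ... ASSISTANT:` prompt format; the prompt template and the example image URL are illustrative assumptions, not guarantees from this repository.

```python
# Minimal usage sketch (assumption: the checkpoint is compatible with the
# transformers LLaVA classes; adapt the loading code if it is not).
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "marianna13/llava-phi-2-3b"

# Load model and processor (fp16 with automatic device placement).
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Example image (illustrative URL; replace with your own input).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Assumed prompt format; check the repository for the exact chat template.
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```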

## Model Sources