kulia-moon committed
Commit f82d794 · verified · 1 Parent(s): 09bcd43

Update README.md

Files changed (1): README.md (+6 -4)
README.md CHANGED
@@ -8,9 +8,8 @@ tags:
 - research
 - pytorch
 - vlm
-base_model:
-# - HuggingFaceTB/SmolLM2-135M
-- lusxvr/nanoVLM-222M
+datasets:
+- HuggingFaceM4/the_cauldron
 ---
 
 **nanoVLM** is a minimal and lightweight Vision-Language Model (VLM) designed for efficient training and experimentation. Built using pure PyTorch, the entire model architecture and training logic fits within ~750 lines of code. It combines a ViT-based image encoder (SigLIP-B/16-224-85M) with a lightweight causal language model (SmolLM2-135M), resulting in a compact 222M parameter model.
@@ -26,4 +25,7 @@ Follow the install instructions and run the following code:
 from models.vision_language_model import VisionLanguageModel
 
 model = VisionLanguageModel.from_pretrained("kulia-moon/jasVLM-nanoVLM")
-```
+```
+# Evaluation
+
+![eval](https://cdn-uploads.huggingface.co/production/uploads/67fb3a09be94c007ddfde83a/ivLetBNo7F_7G5hjT3Qx9.png)
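Beyond the quoted quick-start, an actual inference pass also needs the repo's tokenizer and image-processor helpers. Below is a minimal sketch assuming the upstream nanoVLM repository layout: the helpers `get_tokenizer` / `get_image_processor` (from `data.processors`), the `model.cfg` field names, and the `model.generate(tokens, image, max_new_tokens=...)` signature mirror nanoVLM's `generate.py` at the time of writing and may differ in newer revisions, so treat them as assumptions rather than a fixed API.

```python
# Minimal inference sketch for a nanoVLM checkpoint.
# Assumptions (not guaranteed by this model card): the nanoVLM repo layout,
# the data.processors helpers, the model.cfg field names, and the
# model.generate(tokens, image, max_new_tokens=...) signature.
import torch
from PIL import Image

from models.vision_language_model import VisionLanguageModel
from data.processors import get_tokenizer, get_image_processor  # assumed helpers

device = "cuda" if torch.cuda.is_available() else "cpu"

model = VisionLanguageModel.from_pretrained("kulia-moon/jasVLM-nanoVLM").to(device)
model.eval()

tokenizer = get_tokenizer(model.cfg.lm_tokenizer)               # SmolLM2 tokenizer
image_processor = get_image_processor(model.cfg.vit_img_size)   # 224px resize/normalize

# Encode a text prompt and a single RGB image, then decode the generation.
prompt = "Question: What is in this image? Answer:"
tokens = tokenizer.batch_encode_plus([prompt], return_tensors="pt")["input_ids"].to(device)
image = image_processor(Image.open("example.jpg").convert("RGB")).unsqueeze(0).to(device)

with torch.no_grad():
    generated = model.generate(tokens, image, max_new_tokens=32)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```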