vikhyatk committed on
Commit 6e1575b · verified · 1 Parent(s): c19ca21

Update README.md

Files changed (1)
  1. README.md +8 -1
README.md CHANGED
@@ -7,7 +7,9 @@ Moondream is a small vision language model designed to run efficiently everywher
 
 [Website](https://moondream.ai/) / [Demo](https://moondream.ai/playground) / [GitHub](https://github.com/vikhyat/moondream)
 
-This repository contains the 2025-04-14 **int4** release of Moondream. There's more information about this version of the model in our [release blog post](https://moondream.ai/blog/smaller-faster-moondream-with-qat). Other revisions, as well as release history, can be found [here](https://huggingface.co/vikhyatk/moondream2).
+This repository contains the 2025-04-14 **int4** release of Moondream. On an RTX 3090, it uses 2,305 MB of VRAM and runs at a speed of 187 tokens/second.
+
+There's more information about this version of the model in our [release blog post](https://moondream.ai/blog/smaller-faster-moondream-with-qat). Other revisions, as well as release history, can be found [here](https://huggingface.co/vikhyatk/moondream2).
 
 ### Usage
 
@@ -27,6 +29,11 @@ model = AutoModelForCausalLM.from_pretrained(
     device_map={"": "cuda"}
 )
 
+# Optional, but recommended when running inference on a large number of
+# images since it has upfront compilation cost but significantly speeds
+# up inference:
+model.model.compile()
+
 # Captioning
 print("Short caption:")
 print(model.caption(image, length="short")["caption"])
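
The comment added in this commit recommends compiling only when many images will be processed, since the upfront cost has to be amortized. A minimal sketch of that pattern follows. `caption_many` and its `compile_first` flag are hypothetical and not part of Moondream; the sketch assumes only the two calls that appear in the README itself, `model.model.compile()` and `model.caption(image, length="short")["caption"]`.

```python
from typing import Any, Iterable, List


def caption_many(model: Any, images: Iterable[Any], compile_first: bool = True) -> List[str]:
    """Return a short caption for each image, optionally compiling once up front.

    Hypothetical helper: `model` is assumed to expose `model.model.compile()`
    and `model.caption(image, length="short")`, as shown in the README.
    """
    if compile_first:
        # One-time step, requested before the first image is processed.
        model.model.compile()

    captions = []
    for image in images:
        # Same call as the README's short-caption example.
        captions.append(model.caption(image, length="short")["caption"])
    return captions
```

For a handful of images, passing `compile_first=False` skips the compilation step, which matches the README's description of it as optional.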