soldni committed on
Commit d0262c9
1 Parent(s): f75a7cd

Update README.md

Files changed (1): README.md (+60 -3)
---
license: apache-2.0
---

# Molmo 7B-D

## Quick Start

To run Molmo, first install the dependencies:

```bash
pip install einops torch torchvision Pillow transformers requests
```

Then, follow these steps:

```python
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig
from PIL import Image
import requests

# load the processor
processor = AutoProcessor.from_pretrained(
    'allenai/Molmo-7B-D-0924',
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)

# load the model
model = AutoModelForCausalLM.from_pretrained(
    'allenai/Molmo-7B-D-0924',
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)

# process the image and text
inputs = processor.process(
    images=[Image.open(requests.get("https://picsum.photos/id/237/536/354", stream=True).raw)],
    text="Describe this image."
)

# move inputs to the correct device and make a batch of size 1
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

# generate output; maximum 200 new tokens; stop generation when <|endoftext|> is generated
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer
)

# only keep the generated tokens; decode them to text
generated_tokens = output[0, inputs['input_ids'].size(1):]
generated_text = processor.tokenizer.decode(generated_tokens, skip_special_tokens=True)

# print the generated text
print(generated_text)
```
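
Two of the tensor manipulations above are easy to illustrate without downloading the model. The `unsqueeze(0)` call adds the batch dimension that batched generation expects, and the final slice keeps only the newly generated tokens, since `generate` returns the prompt tokens followed by the new ones. A minimal sketch with toy tensors (the token IDs below are made up for illustration):

```python
import torch

# processor.process returns unbatched tensors; unsqueeze(0) adds the
# batch dimension, turning shape (3,) into (1, 3).
input_ids = torch.tensor([101, 7592, 2088])   # pretend prompt: 3 tokens
batched = input_ids.unsqueeze(0)
print(batched.shape)                          # → torch.Size([1, 3])

# generation output = prompt tokens followed by new tokens, so slicing
# off the first input_ids.size(1) positions of row 0 keeps only the
# generated part.
output = torch.tensor([[101, 7592, 2088, 2023, 2003]])  # prompt + 2 new tokens
generated_tokens = output[0, batched.size(1):]
print(generated_tokens.tolist())              # → [2023, 2003]
```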