Files changed (1)
  1. handler.py +56 -0
handler.py ADDED
@@ -0,0 +1,56 @@
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from PIL import Image
+ import torch
+ from io import BytesIO
+ import base64
+
+ # Initialize the model and tokenizer (moondream2 ships custom modeling code, hence trust_remote_code)
+ model_id = "vikhyatk/moondream2"
+ model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # Use the GPU when CUDA is available; otherwise fall back to the CPU
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model.to(device)
+
+ def preprocess_image(encoded_image):
+     """Decode and preprocess the input image."""
+     decoded_image = base64.b64decode(encoded_image)
+     img = Image.open(BytesIO(decoded_image)).convert("RGB")
+     return img
+
+ def handler(event, context):
+     """Handle the incoming request."""
+     try:
+         # Extract the base64-encoded image and question from the event
+         input_image = event['body']['image']
+         question = event['body'].get('question', "move to the red ball")
+
+         # Preprocess the image
+         img = preprocess_image(input_image)
+
+         # Perform inference
+         enc_image = model.encode_image(img).to(device)
+         answer = model.answer_question(enc_image, question, tokenizer)
+
+         # If the output is a tensor, move it back to CPU and convert to list
+         if isinstance(answer, torch.Tensor):
+             answer = answer.cpu().numpy().tolist()
+
+         # Create the response
+         response = {
+             "statusCode": 200,
+             "body": {
+                 "answer": answer
+             }
+         }
+         return response
+     except Exception as e:
+         # Handle any errors
+         response = {
+             "statusCode": 500,
+             "body": {
+                 "error": str(e)
+             }
+         }
+         return response
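The handler above reads `event['body']['image']` (a base64 string) and an optional `event['body']['question']`. The sketch below shows one way a caller might build that payload and how the body could be normalized if it arrives as a JSON string instead of a dict. `build_event` and `normalize_body` are hypothetical helpers, not part of this change, and the JSON-string case is an assumption about the hosting environment that the diff does not confirm.

```python
import base64
import json


def build_event(image_bytes, question=None):
    """Build an event in the shape handler() reads (hypothetical helper)."""
    # The handler base64-decodes event['body']['image'], so encode the raw bytes here.
    body = {"image": base64.b64encode(image_bytes).decode("ascii")}
    # 'question' is optional; the handler falls back to its own default when it is absent.
    if question is not None:
        body["question"] = question
    return {"body": body}


def normalize_body(event):
    """Return event['body'] as a dict (hypothetical helper).

    Assumption: some gateways deliver the body as a JSON string, in which case
    it has to be parsed before the handler can index into it.
    """
    body = event.get("body") or {}
    if isinstance(body, (str, bytes)):
        body = json.loads(body)
    return body


if __name__ == "__main__":
    # Placeholder bytes only; a real call would pass the contents of an image file.
    event = build_event(b"placeholder-bytes", question="What is in the picture?")
    print(sorted(normalize_body(event).keys()))
```

If the body does arrive as a string, calling `normalize_body(event)` at the top of `handler` and indexing into its result would keep the rest of the function unchanged.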