llama-server does not support images ?

#3
by zoldaten - opened

hi.
i started server with:
llama-server -m allenai_olmOCR-7B-0225-preview-Q4_K_M.gguf --mmproj mmproj-allenai_olmOCR-7B-0225-preview-f16.gguf -c 100000 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 -np 4

sending image with web is OK.

but when i trying with python it fails:

import requests
import base64,sys,os

# Define the API endpoint
url = "http://127.0.0.1:8080/completion"

# Define the payload 
payload = {
    "model": "allenai_olmOCR-7B-0225-preview-Q4_K_M.gguf",
    "prompt": "извлеки полностью текст из картинки. не галлюцинируй.",
    "temperature": 0.7,
    "max_tokens": 500
}

      
#for url     
def image_to_base64_data_url(image_url):
    response = requests.get(image_url)
    if response.status_code == 200:
        print('yes')
        base64_data = base64.b64encode(response.content).decode('ascii')
        return f"data:image/jpeg;base64,{base64_data}"
    else:
        raise Exception(f"Failed to download image from {image_url}")


#url
image_url = 'https://storage.yandexcloud.net/-------/2025/9/4/a3/a3f9d1bb-54c6-4de4-af2b-4fd59d76064f'
data_uri = image_to_base64_data_url(image_url)
#print(data_uri)

payload["image_url"] = {
    "url": data_uri
}

headers = {"Content-Type": "application/json"}
try:
    response = requests.post(url, json=payload, headers=headers)

    # Check if the request was successful
    if response.status_code == 200:
        # Parse the response JSON
        response_data = response.json()
        
        print(response_data)
        # Extract the result from the response
        choices = response_data.get("natural_text", [])
        print(choices)
        #if choices:
        #    result = choices[0].get("text", "")
        #    print("Response:", result)
        #else:
            #print("No choices found in the response.")
    else:
        print(f"Request failed with status code {response.status_code}: {response.text}")
except Exception as e:
    print(f"Error occurred: {e}")

i see response data from model but model does not extract any text from image.

Sign up or log in to comment