olmOCR-7B 8-bit Quantized Model

This is the olmOCR-7B model quantized to 8-bit precision.

Usage with SGLang

To use this 8-bit quantized model with SGLang:

python -m sglang.launch_server \
    --model-path /path/to/olmOCR-7B-8bit \
    --trust-remote-code \
    --port 30000 --host 0.0.0.0

Note: The --trust-remote-code flag is necessary because the model uses a custom tokenizer.
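Once the server is running, it can be queried over SGLang's OpenAI-compatible HTTP API. The sketch below builds a chat-completion payload for an OCR request; the endpoint path, model name, image file, and prompt are illustrative assumptions, not part of this repository.

```python
import base64
import json

# Sketch of an OpenAI-compatible chat request for the SGLang server above.
# The payload shape follows SGLang's OpenAI-compatible API; "page.png" and
# the prompt text are placeholders.
def build_ocr_request(image_path: str, prompt: str) -> dict:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": "olmOCR-7B-8bit",  # assumed served model name
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                {"type": "text", "text": prompt},
            ],
        }],
        "max_tokens": 1024,
    }

# The resulting dict would be POSTed as JSON to
# http://localhost:30000/v1/chat/completions.
```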

Original Model Information

The original model is a qwen2_vl architecture variant, which is a vision-language model based on Qwen2. The model has been modified for OCR capabilities.

Quantization Information

This model has been quantized to 8-bit precision, roughly halving its memory footprint compared with the original 16-bit weights.
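To illustrate the general idea behind 8-bit weight compression, here is a minimal sketch of absmax (symmetric) int8 quantization. This is only an illustration of the technique; the exact quantization scheme used for this checkpoint is not specified in the card.

```python
import numpy as np

# Absmax int8 quantization: map floats into [-127, 127] using a single
# per-tensor scale, then recover approximate values by multiplying back.
def quantize_int8(weights: np.ndarray):
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25, 0.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
# int8 storage is 1 byte per weight vs. 2 bytes for fp16/bf16, hence the
# roughly 2x memory reduction mentioned above (plus small scale overhead).
```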
