Compatibility with olmOCR repo

#2
by pszemraj - opened

Great work! Since you mention this is a "drop in replacement", can I "drop it in" to https://github.com/allenai/olmocr with the --model arg for python -m olmocr.pipeline? Figured I'd ask before trying, as you mention changes to the amount of metadata it wants to see, etc.

edit: I know you provide an example with vLLM but this would require rebuilding olmocr.pipeline to have a CLI script I can point at a directory of PDF files

Hi @pszemraj, the model should be mostly compatible with the olmocr pipeline, with a couple of tweaks: the prompt is different (you might want to modify https://github.com/allenai/olmocr/blob/main/olmocr/prompts/prompts.py), and the model architecture is now Qwen2.5-VL instead of Qwen2-VL. The rest should be the same.
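
For anyone who wants to try this against a directory of PDFs, here is a rough sketch of building the pipeline command from Python. It is not tested against this model. The `--model` flag and `python -m olmocr.pipeline` come from the question above; the positional workspace argument and the `--pdfs` flag are assumptions based on the olmocr README and may differ between versions. `build_olmocr_command` and the model id in the usage note are hypothetical names for illustration, and the prompt change mentioned above still has to be made in `prompts.py` separately.

```python
from pathlib import Path


def build_olmocr_command(workspace, pdf_dir, model):
    """Build the argv list for `python -m olmocr.pipeline` over every PDF in pdf_dir.

    Assumes the CLI takes a positional workspace plus `--pdfs` and `--model`
    (check `python -m olmocr.pipeline --help` for your installed version).
    """
    # Collect PDFs in a stable order so repeated runs build the same command.
    pdfs = sorted(str(p) for p in Path(pdf_dir).glob("*.pdf"))
    if not pdfs:
        raise ValueError(f"no PDF files found in {pdf_dir}")
    return [
        "python", "-m", "olmocr.pipeline", str(workspace),
        "--pdfs", *pdfs,
        "--model", model,
    ]
```

To run it, pass the list to `subprocess.run(cmd, check=True)`, for example with `cmd = build_olmocr_command("./localworkspace", "./my_pdfs", "your-org/your-model")`, where the model id is a placeholder for whichever checkpoint you are pointing at.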
