PixelReasoner-RL-v1 / README.md
JasperHaozhe's picture
Update README.md
477cc28 verified
|
raw
history blame
549 Bytes
metadata
license: apache-2.0
datasets:
  - TIGER-Lab/PixelReasoner-SFT-Data
language:
  - en
metrics:
  - accuracy
base_model:
  - Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: question-answering

The model is trained with curiosity-driven RL described in paper.

We have released vllm based inference code at https://github.com/TIGER-AI-Lab/Pixel-Reasoner/.

We will release a simple hf.generate() based inference code.

Please also play with the cool interactive demo