Enhancing Step-by-Step and Verifiable Medical Reasoning in MLLMs
Abstract
MICS, a novel reasoning-path search scheme, generates comprehensive chain-of-thought data that equips medical MLLMs such as Chiron-o1 with robust, generalizable reasoning and visual question-answering capabilities.
Multimodal large language models (MLLMs) have begun to demonstrate robust reasoning capabilities on general tasks, yet their application in the medical domain remains in its early stages. Constructing chain-of-thought (CoT) training data is essential for bolstering the reasoning abilities of medical MLLMs. However, existing approaches fail to offer a comprehensive framework for searching and evaluating effective reasoning paths toward critical diagnoses. To address this challenge, we propose Mentor-Intern Collaborative Search (MICS), a novel reasoning-path search scheme for generating rigorous and effective medical CoT data. MICS first leverages mentor models to initialize the reasoning, one step at a time, then prompts each intern model to continue the thinking along those initiated paths, and finally selects the optimal reasoning path according to the overall reasoning performance of multiple intern models. Reasoning performance is measured by an MICS-Score, which assesses the quality of generated reasoning paths. Eventually, we construct MMRP, a multi-task medical reasoning dataset with ranked difficulty, and Chiron-o1, a new medical MLLM trained via a curriculum learning strategy, with robust visual question-answering and generalizable reasoning capabilities. Extensive experiments demonstrate that Chiron-o1, trained on our CoT dataset constructed using MICS, achieves state-of-the-art performance across a range of medical visual question answering and reasoning benchmarks. Code is available at https://github.com/manglu097/Chiron-o1
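The mentor-intern search loop described above can be sketched in a few lines. This is a minimal, illustrative sketch only: the interfaces (`mentor(question, path)`, `intern(question, path)`, `score_answer`) are hypothetical stand-ins, not the paper's released API, and real MICS uses MLLM mentors and interns rather than toy callables.

```python
# Illustrative sketch of Mentor-Intern Collaborative Search (MICS).
# All interfaces here are hypothetical stand-ins: mentors propose one
# reasoning step, interns complete the path, and score_answer grades
# the final answer. None of these names come from the released code.

def mics_search(question, mentors, interns, score_answer, max_steps=3):
    """Greedy search: at each step, keep the mentor-proposed step whose
    extended path lets the intern models reason best on average."""
    path = []
    for _ in range(max_steps):
        best_step, best_score = None, float("-inf")
        for mentor in mentors:
            step = mentor(question, path)  # candidate next reasoning step
            # MICS-Score analogue: mean quality of intern completions
            # continued from the extended path.
            score = sum(
                score_answer(intern(question, path + [step]))
                for intern in interns
            ) / len(interns)
            if score > best_score:
                best_step, best_score = step, score
        path.append(best_step)
    return path

# Toy demo with stand-in callables.
mentors = [lambda q, p: f"step-{len(p)}-A", lambda q, p: f"step-{len(p)}-B"]
interns = [lambda q, p: p[-1]]  # trivially echoes the last step
score_answer = lambda ans: 1.0 if ans.endswith("A") else 0.0
print(mics_search("What does the scan show?", mentors, interns, score_answer))
```

The greedy step-by-step selection mirrors the paper's description of extending one step at a time and keeping the path that interns can best continue.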
Community
MICS: a novel reasoning-path search method for generating high-quality, verifiable medical chain-of-thought data. Using MICS together with a novel curriculum learning strategy, the authors trained Chiron-o1, a medical multimodal large language model. Experiments show that Chiron-o1 achieves SOTA performance on multiple medical visual question answering and reasoning benchmarks. The code has just been open-sourced!
GitHub:https://github.com/manglu097/Chiron-o1
Model: https://huggingface.co/manglu3935/Chiron-o1-8B
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making (2025)
- Infi-Med: Low-Resource Medical MLLMs with Robust Reasoning Evaluation (2025)
- ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning (2025)
- Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning (2025)
- Elicit and Enhance: Advancing Multimodal Reasoning in Medical Scenarios (2025)
- Rethinking Test-Time Scaling for Medical AI: Model and Task-Aware Strategies for LLMs and VLMs (2025)
- Beyond Distillation: Pushing the Limits of Medical LLM Reasoning with Minimalist Rule-Based RL (2025)
Models citing this paper 2