Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 108
MedBrowseComp: Benchmarking Medical Deep Research and Computer Use Paper • 2505.14963 • Published May 20 • 2
Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models Paper • 2505.13774 • Published May 19 • 1
When Models Reason in Your Language: Controlling Thinking Trace Language Comes at the Cost of Accuracy Paper • 2505.22888 • Published May 28 • 6
XReasoning - models Collection https://arxiv.org/abs/2505.22888 • "ds" means continued post-training on DeepSeek distilled Qwen Math 7B; model names follow limo-{language}-{amount of data} • 19 items • Updated Jun 4 • 1