DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15 • 52
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published Nov 21 • 58