Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty • Paper 2507.16806 • Published 15 days ago
A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods • Paper 2502.01618 • Published Feb 3