Distributionally Robust Optimization with Bias and Variance Reduction Paper • 2310.13863 • Published Oct 21, 2023
The Benefits of Balance: From Information Projections to Variance Reduction Paper • 2408.15065 • Published Aug 27, 2024 • 1
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates Paper • 2406.12935 • Published Jun 17, 2024 • 2
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models Paper • 2406.12257 • Published Jun 18, 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 39
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities Paper • 2502.12025 • Published Feb 17, 2025 • 3
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Paper • 2505.14625 • Published May 20, 2025 • 13
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL Paper • 2505.23977 • Published May 29, 2025 • 10
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement Paper • 2504.07934 • Published Apr 10, 2025 • 20
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published Mar 26, 2025 • 14