GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning • Paper • 2507.01006 • Published 15 days ago • 184
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy • Paper • 2507.01352 • Published 15 days ago • 50
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation • Paper • 2506.14028 • Published about 1 month ago • 91
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following • Paper • 2506.12285 • Published Jun 14 • 54
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention • Paper • 2506.13585 • Published about 1 month ago • 253
Audio-Aware Large Language Models as Judges for Speaking Styles • Paper • 2506.05984 • Published Jun 6 • 15
ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents • Paper • 2505.23923 • Published May 29 • 7
Reverse Preference Optimization for Complex Instruction Following • Paper • 2505.22172 • Published May 28 • 6
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published Jan 22 • 409
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking • Paper • 2502.20730 • Published Feb 28 • 38
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations • Paper • 2410.22821 • Published Oct 30, 2024 • 2
Iterative Forward Tuning Boosts In-Context Learning in Language Models • Paper • 2305.13016 • Published May 22, 2023 • 1
CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment • Paper • 2310.16271 • Published Oct 25, 2023 • 1
From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News • Paper • 2403.09498 • Published Mar 14, 2024 • 1
Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer • Paper • 2403.19979 • Published Mar 29, 2024 • 1