Tsinghua University

Recent Activity

MasterVito

authored a paper 3 days ago

TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression

Paper • 2506.02678 • Published 23 days ago • 5

MasterVito

authored a paper 8 days ago

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

Paper • 2506.14245 • Published 9 days ago • 35

MasterVito

authored a paper 10 days ago

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Paper • 2506.08989 • Published 16 days ago • 14

HuanjinYao

authored a paper 29 days ago

R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO

Paper • 2505.16673 • Published May 22 • 2

lixiaochuan2020

authored 2 papers about 1 month ago

Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning

Paper • 2410.14208 • Published Oct 18, 2024 • 3

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 45

HuanjinYao

authored a paper 3 months ago

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17 • 29

Ringo1110

authored 2 papers 4 months ago

ProReflow: Progressive Reflow with Decomposed Velocity

Paper • 2503.04824 • Published Mar 5 • 9

IterPref: Focal Preference Learning for Code Generation via Iterative Debugging

Paper • 2503.02783 • Published Mar 4 • 6

Ringo1110

authored 3 papers 6 months ago

Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance

Paper • 2406.15330 • Published Jun 21, 2024

Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training

Paper • 2411.14318 • Published Nov 21, 2024

EpiCoder: Encompassing Diversity and Complexity in Code Generation

Paper • 2501.04694 • Published Jan 8 • 16

zsqzz

authored a paper 6 months ago

Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 38

HuanjinYao

authored a paper 6 months ago

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 40

txiong23

authored a paper 9 months ago

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3, 2024 • 38

zsqzz

authored 2 papers 10 months ago

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13, 2024 • 67

Efficient LLM Scheduling by Learning to Rank

Paper • 2408.15792 • Published Aug 28, 2024 • 21

HuanjinYao

authored a paper about 1 year ago

Dense Connector for MLLMs

Paper • 2405.13800 • Published May 22, 2024 • 25

lixiaochuan2020

authored 2 papers about 1 year ago

Do Large Language Models Know about Facts?

Paper • 2310.05177 • Published Oct 8, 2023

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11, 2024 • 51