Qian Liu

SivilTaram

http://siviltaram.github.io/

AI & ML interests

Cooking cool things

Recent Activity

authored a paper about 18 hours ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

upvoted a paper 1 day ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

commented on a paper 1 day ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

upvoted a paper 1 day ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Paper • 2507.01004 • Published 4 days ago • 6

upvoted a paper 5 days ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published 5 days ago • 39

upvoted a paper 8 days ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published 10 days ago • 42

upvoted a paper 9 days ago

MMSearch-R1: Incentivizing LMMs to Search

Paper • 2506.20670 • Published 10 days ago • 57

upvoted 2 collections 18 days ago

MiniMax-M1

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 3 days ago • 106

AceReason

Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 3 days ago • 13

upvoted 2 papers 18 days ago

TaskCraft: Automated Generation of Agentic Tasks

Paper • 2506.10055 • Published 24 days ago • 31

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published 19 days ago • 249

upvoted an article 24 days ago

GRPO for GUI Grounding Done Right

Article • 25 days ago • 29

upvoted a collection about 1 month ago

Qwen3

72 items • Updated 20 days ago • 825

upvoted 4 papers about 1 month ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 165

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 132

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning

Paper • 2505.16400 • Published May 22 • 31

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published May 21 • 33

upvoted 4 papers about 2 months ago

General-Reasoner: Advancing LLM Reasoning Across All Domains

Paper • 2505.14652 • Published May 20 • 22

Group-in-Group Policy Optimization for LLM Agent Training

Paper • 2505.10978 • Published May 16 • 8

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 209

Parallel Scaling Law for Language Models

Paper • 2505.10475 • Published May 15 • 81

upvoted a paper 2 months ago

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models

Paper • 2505.02735 • Published May 5 • 31

upvoted a paper 3 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 92