Wei Liu

PeterV09

https://vpeterv.github.io

AI & ML interests

Machine Learning, Natural Language Processing

Recent Activity

liked a model 17 days ago

moonshotai/Kimi-K2.6

upvoted a paper about 2 months ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

upvoted a paper about 2 months ago

InCoder-32B: Code Foundation Model for Industrial Scenarios

upvoted 4 papers about 2 months ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 139

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published Mar 17 • 311

AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 426

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

Paper • 2512.13607 • Published Dec 15, 2025 • 38

upvoted a paper 2 months ago

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Paper • 2602.23166 • Published Feb 26 • 45

upvoted 2 papers 3 months ago

Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems

Paper • 2602.08847 • Published Feb 9 • 29

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

Paper • 2602.07962 • Published Feb 8 • 24

upvoted a collection 3 months ago

Dr.Kernel

8 items • Updated Feb 6 • 4

upvoted 2 papers 3 months ago

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Paper • 2602.05885 • Published Feb 5 • 28

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 43

upvoted 3 papers 4 months ago

Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics

Paper • 2601.14027 • Published Jan 20 • 13

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 230

SWE-RM: Execution-free Feedback For Software Engineering Agents

Paper • 2512.21919 • Published Dec 26, 2025 • 10

upvoted a paper 5 months ago

Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning

Paper • 2511.20549 • Published Nov 25, 2025 • 27

upvoted 6 papers 6 months ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 134

TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models

Paper • 2511.13704 • Published Nov 17, 2025 • 44

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published Nov 11, 2025 • 36

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30, 2025 • 87

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 73