Makise Kurisu

kurisu0306

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 6 days ago

The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

Paper • 2605.26494 • Published 7 days ago • 38

upvoted 2 papers 22 days ago

T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning

Paper • 2605.02178 • Published 29 days ago • 10

Flow-OPD: On-Policy Distillation for Flow Matching Models

Paper • 2605.08063 • Published 25 days ago • 98

upvoted a collection 24 days ago

Qwen3.5

21 items • Updated Mar 9 • 1.66k

upvoted a collection about 1 month ago

DeepSeek-V4

4 items • Updated Apr 24 • 663

upvoted a paper about 1 month ago

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Paper • 2604.18292 • Published Apr 20 • 85

upvoted a paper about 2 months ago

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published Apr 6 • 203

upvoted a paper 2 months ago

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published Mar 27 • 365

upvoted 3 papers 3 months ago

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Paper • 2602.13367 • Published Feb 13 • 36

IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

Paper • 2603.12201 • Published Mar 12 • 53

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 266

upvoted 8 papers 4 months ago

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Paper • 2602.04634 • Published Feb 4 • 100

daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently

Paper • 2602.02619 • Published Feb 2 • 53

CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs

Paper • 2602.03048 • Published Feb 3 • 32

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

Paper • 2602.02486 • Published Feb 2 • 20

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 273

ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas

Paper • 2601.21558 • Published Jan 29 • 61

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 181

Not All Correct Answers Are Equal: Why Your Distillation Source Matters

Paper • 2505.14464 • Published May 20, 2025 • 10

upvoted a collection 4 months ago

ASTRA Dataset

2 items • Updated Jan 21 • 5