Alexander Kovrigin

waleko

https://alexkovrigin.me

AI & ML interests

AI for Code

Recent Activity

liked a Space 3 days ago

duoan/TorchCode

updated a dataset 23 days ago

waleko/Dolci-RL-Zero-Math-7B-solved

published a dataset 23 days ago

waleko/Dolci-RL-Zero-Math-7B-solved

Organizations

upvoted a paper about 2 months ago

PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published Jan 30 • 221

upvoted a paper 4 months ago

Adversarial Flow Models

Paper • 2511.22475 • Published Nov 27, 2025 • 24

upvoted 2 papers 5 months ago

The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation

Paper • 2510.23393 • Published Oct 27, 2025 • 21

The Complexity Trap: Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management

Paper • 2508.21433 • Published Aug 29, 2025 • 7

upvoted a collection 6 months ago

🦫 PIPer

Collection

All the resources for our paper "PIPer: On-Device Environment Setup via Online Reinforcement Learning"! • 9 items • Updated Oct 1, 2025 • 3

upvoted 2 papers 6 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 91

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Paper • 2509.25455 • Published Sep 29, 2025 • 38

upvoted 2 papers 9 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 51

ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models

Paper • 2505.22569 • Published May 28, 2025 • 55

upvoted 4 papers 10 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 191

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26, 2025 • 95

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 132

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29, 2025 • 98

upvoted an article about 1 year ago

Open R1: Update #3

Article • Mar 11, 2025 • 297

upvoted a paper about 1 year ago

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20, 2025 • 195

upvoted an article about 1 year ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4, 2025 • 1.32k

upvoted an article over 1 year ago

Little Paper Reviews & AutoCodeRover

Article • Oct 2, 2024

upvoted 3 papers almost 2 years ago

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 69

Long Code Arena: a Set of Benchmarks for Long-Context Code Models

Paper • 2406.11612 • Published Jun 17, 2024 • 25

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Paper • 2406.08973 • Published Jun 13, 2024 • 89
