Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • Paper 2502.11089 • Published 4 days ago • 124 upvotes
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks • Paper 2502.08235 • Published 9 days ago • 50 upvotes
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction • Paper 2502.07316 • Published 10 days ago • 40 upvotes
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates • Paper 2502.06772 • Published 10 days ago • 18 upvotes
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • Paper 2502.06703 • Published 10 days ago • 132 upvotes
LLMs Can Easily Learn to Reason from Demonstrations; Structure, not content, is what matters! • Paper 2502.07374 • Published 10 days ago • 31 upvotes
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach • Paper 2502.05171 • Published 13 days ago • 112 upvotes
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators • Paper 2502.06394 • Published 10 days ago • 84 upvotes
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training • Paper 2501.11425 • Published Jan 20, 2025 • 91 upvotes
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate • Paper 2501.17703 • Published 22 days ago • 55 upvotes
RedPajama: an Open Dataset for Training Large Language Models • Paper 2411.12372 • Published Nov 19, 2024 • 51 upvotes
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training • Paper 2411.15124 • Published Nov 22, 2024 • 59 upvotes
Training Large Language Models to Reason in a Continuous Latent Space • Paper 2412.06769 • Published Dec 9, 2024 • 78 upvotes