euclaise

https://euclaise.xyz

AI & ML interests

None yet

Recent Activity

liked a dataset about 3 hours ago

TuringEnterprises/Open-RL

liked a dataset about 3 hours ago

nvidia/Nemotron-Math-HumanReasoning

liked a dataset about 4 hours ago

allenai/tulu-3-sft-mixture

upvoted 2 papers about 5 hours ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published 6 days ago • 47

Lost in Backpropagation: The LM Head is a Gradient Bottleneck

Paper • 2603.10145 • Published 5 days ago • 6

upvoted 3 papers 14 days ago

upvoted a paper 17 days ago

On the "Induction Bias" in Sequence Models

Paper • 2602.18333 • Published 23 days ago • 4

upvoted 2 papers 18 days ago

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Paper • 2602.21196 • Published 19 days ago • 5

One-step Language Modeling via Continuous Denoising

Paper • 2602.16813 • Published 24 days ago • 4

upvoted an article 20 days ago

Differential Transformer V2

Article • Jan 20

upvoted a paper 22 days ago

2Mamba2Furious: Linear in Complexity, Competitive in Accuracy

Paper • 2602.17363 • Published 24 days ago • 8

upvoted 3 papers 23 days ago

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Paper • 2602.13367 • Published 30 days ago • 31

On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

Paper • 2602.15322 • Published 26 days ago • 9

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Paper • 2602.13515 • Published 29 days ago • 43

upvoted 4 papers 29 days ago

Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum

Paper • 2510.00526 • Published Oct 1, 2025 • 10

Dynamic Long Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning

Paper • 2602.08382 • Published Feb 9 • 11

When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 29

Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

Paper • 2602.12036 • Published about 1 month ago • 91

upvoted 3 papers about 1 month ago

Prism: Spectral-Aware Block-Sparse Attention

Paper • 2602.08426 • Published Feb 9 • 36

iGRPO: Self-Feedback-Driven LLM Reasoning

Paper • 2602.09000 • Published Feb 9 • 17

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 41
