yilong xu

sapphirex

AI & ML interests

None yet

Recent Activity

upvoted a paper 14 days ago

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

Paper • 2606.17682 • Published 17 days ago • 26

upvoted a paper 20 days ago

Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

Paper • 2606.13106 • Published 22 days ago • 21

upvoted a paper 22 days ago

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

Paper • 2606.11052 • Published 24 days ago • 16

upvoted a paper 28 days ago

MemTrain: Self-Supervised Context Memory Training

Paper • 2606.03197 • Published about 1 month ago • 17

upvoted a paper 30 days ago

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published May 31 • 46

upvoted a paper about 1 month ago

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Paper • 2605.18703 • Published May 18 • 50

liked a dataset 3 months ago

Mosi-AI/LiveClawbench-trajectories

Viewer • Updated 7 days ago • 7.46k • 185 • 3

upvoted a paper 4 months ago

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

Paper • 2602.17684 • Published Feb 4 • 22

updated a dataset 5 months ago

sapphirex/lucene-msmarcov2.1

Updated Feb 8 • 4

published a dataset 5 months ago

sapphirex/lucene-msmarcov2.1

Updated Feb 8 • 4

upvoted a paper 5 months ago

MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling

Paper • 2602.03359 • Published Feb 3 • 10

upvoted a collection 8 months ago

Annotation-Efficient Universal Honesty Alignment

Collection

Official Collections of paper "Annotation-Efficient Universal Honesty Alignment". • 5 items • Updated Oct 21, 2025 • 3

upvoted a paper 8 months ago

Annotation-Efficient Universal Honesty Alignment

Paper • 2510.17509 • Published Oct 20, 2025 • 22

liked a model 9 months ago

lnm1p/search-gen-v-4b

Updated Oct 24, 2025 • 2

liked a dataset 9 months ago

lnm1p/Search-Gen-V

Viewer • Updated Oct 24, 2025 • 46.7k • 18 • 1

liked a model 9 months ago

openbmb/VoxCPM-0.5B

Text-to-Speech • Updated Sep 19, 2025 • 6.78k • 807

liked 3 models 10 months ago

google/embeddinggemma-300m

openbmb/MiniCPM4.1-8B

Text Generation • 8B • Updated Oct 24, 2025 • 88.5k • 391

openbmb/MiniCPM-V-4_5

Image-Text-to-Text • 9B • Updated Mar 10 • 87.6k • 1.09k

updated a dataset 10 months ago

sapphirex/RAVine-logs

Updated Aug 25, 2025 • 3.89k • 1
