Yu Cheng

ych133

https://ych133.github.io/

AI & ML interests

None yet

Organizations

None yet

ych133's activity

upvoted a collection about 20 hours ago

LUFFY-RL

4 items • Updated about 24 hours ago • 3

upvoted a paper about 23 hours ago

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published 2 days ago • 58

upvoted a paper 4 days ago

Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark

Paper • 2501.05444 • Published Jan 9 • 1

upvoted a paper 13 days ago

Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published 15 days ago • 80

upvoted 2 papers 2 months ago

MoM: Linear Sequence Modeling with Mixture-of-Memories

Paper • 2502.13685 • Published Feb 19 • 35

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published Jan 22 • 61

authored 4 papers 3 months ago

Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning

Paper • 2501.15103 • Published Jan 25

From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning

Paper • 2501.11877 • Published Jan 21

Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing

Paper • 1804.03287 • Published Apr 10, 2018

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3 • 60

upvoted a paper 3 months ago

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3 • 60

authored 8 papers 3 months ago

What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs

Paper • 2410.10863 • Published Oct 7, 2024 • 1

DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs

Paper • 2407.11030 • Published Jul 3, 2024

LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Paper • 2411.15708 • Published Nov 24, 2024

RoRA-VLM: Robust Retrieval-Augmented Vision Language Models

Paper • 2410.08876 • Published Oct 11, 2024

Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation?

Paper • 2411.03670 • Published Nov 6, 2024

Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published Dec 23, 2024 • 44

PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

Paper • 2501.03124 • Published Jan 6 • 14

Scaling Laws for Floating Point Quantization Training

Paper • 2501.02423 • Published Jan 5 • 27