Yifan Zeng

yokey

https://xhmy.github.io/

AI & ML interests

Large Language Model, Agentic AI, Deep Learning

Organizations

None yet

yokey's activity

upvoted a paper about 3 hours ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published 1 day ago • 38

upvoted an article 2 days ago

Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset

Article • 16 days ago • 40

upvoted a paper about 1 month ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 61

liked a model about 2 months ago

sfairXC/FsfairX-LLaMA3-RM-v0.1

Text Classification • Updated Oct 14, 2024 • 4.17k • 54

upvoted a paper 2 months ago

Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46

upvoted a paper 3 months ago

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 58

New activity in google/gemma-2-9b 4 months ago

RuntimeError: Index put requires the source and destination dtypes match, got BFloat16 for the destination and Float for the source.

#24 opened 8 months ago by saireddy

upvoted 2 papers 4 months ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 78

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

Paper • 2410.19609 • Published Oct 25, 2024 • 17

authored a paper 4 months ago

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Paper • 2410.16033 • Published Oct 18, 2024

liked a model 4 months ago

nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

Text Generation • Updated Oct 25, 2024 • 114k • 2.02k

commented a paper 4 months ago

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Paper • 2410.13828 • Published Oct 17, 2024 • 4

authored 2 papers 4 months ago

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Paper • 2410.13828 • Published Oct 17, 2024 • 4

LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking

Paper • 2406.00231 • Published May 31, 2024

upvoted a paper 4 months ago

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Paper • 2410.13828 • Published Oct 17, 2024 • 4

updated a collection 4 months ago

LLM

Collection • 19 items • Updated Oct 17, 2024

liked a model 5 months ago

openai-community/gpt2

Text Generation • Updated Feb 19, 2024 • 16.6M • 2.6k

upvoted a paper 5 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 137

updated a collection 5 months ago

LLM

Collection • 19 items • Updated Oct 17, 2024

updated a collection 6 months ago

LLM

Collection • 19 items • Updated Oct 17, 2024