Keyu Duan

vermouthdky

https://kduan.live

AI & ML interests

LLM Reasoning and Safety

Recent Activity

liked a model about 1 month ago

MiniMaxAI/MiniMax-M3

authored a paper 4 months ago

In-Context Reinforcement Learning for Tool Use in Large Language Models

upvoted an article 5 months ago

Forge: Scalable Agent RL Framework and Algorithm

Collections 2

Papers 3

arxiv:2603.08068

arxiv:2510.01051

arxiv:2504.10559

Models 6

vermouthdky/llama-3-70_unnatural_instruction_lima

Updated Feb 28, 2025

vermouthdky/llama-3-70_natural_instruction_lima

Updated Feb 28, 2025

vermouthdky/llama-3_unnatural_instruction_lima

Updated Feb 28, 2025

vermouthdky/llama-3_natural_instruction_lima

Updated Feb 28, 2025 • 2

vermouthdky/gemma-2_natural_instruction_lima

Updated Feb 28, 2025

vermouthdky/gemma-2_unnatural_instruction_lima

Updated Feb 28, 2025

Datasets 6

vermouthdky/Unnatural_SimGSM8K

Viewer • Updated Feb 24, 2025 • 100 • 22

vermouthdky/Unnatural_SynContextQA

Viewer • Updated Feb 24, 2025 • 200 • 28

vermouthdky/Unnatural_LIMA

Viewer • Updated Feb 24, 2025 • 1k • 19

vermouthdky/prm800k-phase2

Viewer • Updated Dec 19, 2024 • 491k • 28

vermouthdky/prm800k-phase1

Viewer • Updated Dec 19, 2024 • 30.9k • 12

vermouthdky/SimTeG

Updated Jul 12, 2023 • 45 • 1