Aaron Cummings

Aaron-Cu

AI & ML interests

PhD Student CS | Artificial Intelligence Security and Optimization Graduate Research Assistant @ Kennesaw State University | MSCS | BSCS

Recent Activity

liked a dataset about 1 month ago

rajpurkar/squad

upvoted a paper 3 months ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

upvoted a paper 3 months ago

Reinforcement Pre-Training

Organizations

liked a dataset about 1 month ago

rajpurkar/squad

Viewer • Updated Mar 4, 2024 • 98.2k • 117k • 356

upvoted 5 papers 3 months ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31, 2025 • 305

upvoted a paper 7 months ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 141

liked a dataset 7 months ago

arxiv-community/arxiv_dataset

Updated Jan 18, 2024 • 513 • 134

upvoted a collection 8 months ago

SmolLM2

Collection

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 305

upvoted 3 papers 9 months ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17, 2025 • 50

A General Theoretical Paradigm to Understand Learning from Human Preferences

Paper • 2310.12036 • Published Oct 18, 2023 • 19

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64

upvoted an article 9 months ago

Article

Preference Tuning LLMs with Direct Preference Optimization Methods

Jan 18, 2024

liked a dataset 9 months ago

Anthropic/hh-rlhf

Viewer • Updated May 26, 2023 • 169k • 24.3k • 1.68k

upvoted 2 articles 9 months ago

Article

How to train a new language model from scratch using Transformers and Tokenizers

Feb 14, 2020

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Dec 9, 2022 • 405

liked 4 datasets about 1 year ago

shareAI/ShareGPT-Chinese-English-90k

Preview • Updated Dec 29, 2025 • 997 • 279

RyokoAI/ShareGPT52K

Preview • Updated Apr 2, 2023 • 911 • 355

fka/prompts.chat

Viewer • Updated about 5 hours ago • 1.48k • 25.3k • 9.62k

anon8231489123/ShareGPT_Vicuna_unfiltered

Updated Apr 12, 2023 • 134k • 850
