james curry

ainbo

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

Paper • 2604.11784 • Published 7 days ago • 137

upvoted a paper 6 days ago

OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation

Paper • 2604.11804 • Published 7 days ago • 69

upvoted 2 papers 20 days ago

AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 424

Gen-Searcher: Reinforcing Agentic Search for Image Generation

Paper • 2603.28767 • Published 20 days ago • 57

upvoted 3 papers about 1 month ago

Lost in Stories: Consistency Bugs in Long Story Generation by LLMs

Paper • 2603.05890 • Published Mar 6 • 93

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 58

HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising

Paper • 2603.08703 • Published Mar 9 • 32

upvoted 6 papers 3 months ago

Causal World Modeling for Robot Control

Paper • 2601.21998 • Published Jan 29 • 31

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

Paper • 2601.16973 • Published Jan 23 • 40

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15 • 36

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published Jan 8 • 170

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Paper • 2601.02427 • Published Jan 4 • 46

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176

upvoted 6 papers 4 months ago

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published Dec 18, 2025 • 222

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Paper • 2512.17909 • Published Dec 19, 2025 • 37

Kling-Omni Technical Report

Paper • 2512.16776 • Published Dec 18, 2025 • 173

Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model

Paper • 2512.13507 • Published Dec 15, 2025 • 41

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Paper • 2512.15745 • Published Dec 10, 2025 • 88

Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published Dec 18, 2025 • 89

upvoted a paper 5 months ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 161