PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published Oct 2024 • 53
GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI Paper • 2409.01392 • Published Sep 2, 2024 • 9
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models Paper • 2405.18377 • Published May 28, 2024 • 18
Specialized Language Models with Cheap Inference from Limited Domain Data Paper • 2402.01093 • Published Feb 2, 2024 • 45
TravelPlanner: A Benchmark for Real-World Planning with Language Agents Paper • 2402.01622 • Published Feb 2, 2024 • 33
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models Paper • 2402.01118 • Published Feb 2, 2024 • 29
MM-LLMs: Recent Advances in MultiModal Large Language Models Paper • 2401.13601 • Published Jan 24, 2024 • 44
MM-VID: Advancing Video Understanding with GPT-4V(ision) Paper • 2310.19773 • Published Oct 30, 2023 • 19
A Zero-Shot Language Agent for Computer Control with Structured Reflection Paper • 2310.08740 • Published Oct 12, 2023 • 14
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest Paper • 2307.03601 • Published Jul 7, 2023 • 11