ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent Paper • 2312.10003 • Published Dec 15, 2023 • 35
ReAct: Synergizing Reasoning and Acting in Language Models Paper • 2210.03629 • Published Oct 6, 2022 • 14
Reflexion: Language Agents with Verbal Reinforcement Learning Paper • 2303.11366 • Published Mar 20, 2023 • 4
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework Paper • 2308.08155 • Published Aug 16, 2023 • 3
Gorilla: Large Language Model Connected with Massive APIs Paper • 2305.15334 • Published May 24, 2023 • 4
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges Paper • 2401.07339 • Published Jan 14, 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement Paper • 2402.07456 • Published Feb 12, 2024 • 41
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 73
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis Paper • 2307.12856 • Published Jul 24, 2023 • 35
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents Paper • 2403.08715 • Published Mar 13, 2024 • 20
Emergent Agentic Transformer from Chain of Hindsight Experience Paper • 2305.16554 • Published May 26, 2023
Improving Agent Interactions in Virtual Environments with Language Models Paper • 2402.05440 • Published Feb 8, 2024 • 1
TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents Paper • 2308.03427 • Published Aug 7, 2023 • 14
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction Paper • 2401.06201 • Published Jan 11, 2024 • 2
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving Paper • 2309.17452 • Published Sep 29, 2023 • 3
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents Paper • 2311.05437 • Published Nov 9, 2023 • 45
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents Paper • 2310.09343 • Published Oct 13, 2023 • 2
Evaluating Very Long-Term Conversational Memory of LLM Agents Paper • 2402.17753 • Published Feb 27, 2024 • 18
Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning Paper • 2312.14878 • Published Dec 22, 2023 • 13
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency Paper • 2309.17382 • Published Sep 29, 2023 • 4
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 109
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning Paper • 2401.10480 • Published Jan 19, 2024
Inferring the Goals of Communicating Agents from Actions and Instructions Paper • 2306.16207 • Published Jun 28, 2023
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation Paper • 2402.10178 • Published Feb 15, 2024
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023 • 74
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration Paper • 2402.11550 • Published Feb 18, 2024 • 15
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Paper • 2404.02575 • Published Apr 3, 2024 • 47
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2, 2024 • 104
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline Paper • 2404.02893 • Published Apr 3, 2024 • 20
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4, 2024 • 60
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4, 2024 • 24
Best Practices and Lessons Learned on Synthetic Data for Language Models Paper • 2404.07503 • Published Apr 11, 2024 • 29
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published Apr 11, 2024 • 47
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published Apr 11, 2024 • 44
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18, 2024 • 53
CodeCoT and Beyond: Learning to Program and Test like a Developer Paper • 2308.08784 • Published Aug 17, 2023 • 5
Lemur: Harmonizing Natural Language and Code for Language Agents Paper • 2310.06830 • Published Oct 10, 2023 • 30
SCREWS: A Modular Framework for Reasoning with Revisions Paper • 2309.13075 • Published Sep 20, 2023 • 15
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation Paper • 2310.15123 • Published Oct 23, 2023 • 7
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search Paper • 2310.13227 • Published Oct 20, 2023 • 12
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models Paper • 2310.04406 • Published Oct 6, 2023 • 8
Autonomous Tree-search Ability of Large Language Models Paper • 2310.10686 • Published Oct 14, 2023 • 2
Reverse Chain: A Generic-Rule for LLMs to Master Multi-API Planning Paper • 2310.04474 • Published Oct 6, 2023 • 2
AgentTuning: Enabling Generalized Agent Abilities for LLMs Paper • 2310.12823 • Published Oct 19, 2023 • 35
A Zero-Shot Language Agent for Computer Control with Structured Reflection Paper • 2310.08740 • Published Oct 12, 2023 • 14
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn Paper • 2306.08640 • Published Jun 14, 2023 • 26
Diversity of Thought Improves Reasoning Abilities of Large Language Models Paper • 2310.07088 • Published Oct 11, 2023 • 5
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published Apr 19, 2024 • 41
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29, 2024 • 68
Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations Paper • 2404.17521 • Published Apr 26, 2024 • 12
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29, 2024 • 118
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale Paper • 2409.16299 • Published Sep 9, 2024 • 9