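The pursuit of artificial general intelligence (AGI) dates back to the mid-1950s, when AI researchers had high hopes that machines could attain human-level thinking. As the research deepened, however, they found the goal far harder to reach than first imagined. Even today, AGI still has a long way to go.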
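The good news is that research on autonomous agents made many breakthroughs at this year's top conferences: the social-interaction and intelligence problems that long troubled AI Agent researchers now have new directions for solution, thanks to the development of large language models (LLMs).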
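To help everyone keep up with the latest research in the AI Agent field, I have compiled 52 of the latest 2023 papers on large-model agents, covering the construction, application, and evaluation of LLM-based agents.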
1.A Survey on Large Language Model-based Autonomous Agents
2.The Rise and Potential of Large Language Model Based Agents: A Survey
1.CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society
2.Agent Instructs Large Language Models to be General Zero-Shot Reasoners
3.Reflexion: Language Agents with Verbal Reinforcement Learning
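Summary: This paper proposes a new framework called Reflexion that strengthens language agents through verbal feedback rather than weight updates. The agent verbally reflects on task feedback and stores those reflections in memory, which induces better decisions in subsequent trials. The framework clearly outperforms baselines on a variety of tasks and gives language agents a fast, efficient trial-and-error learning mechanism.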
4.AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
5.Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph
6.SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
7.Tree of Thoughts: Deliberate Problem Solving with Large Language Models
8.AVIS: Autonomous Visual Information Seeking with Large Language Models
9.Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond
10.Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
11.Learning Distributed Representations of Sentences from Unlabelled Data
12.A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
13.HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
14.Large Language Models as Tool Makers
15.InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language
16.AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
17.InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
18.PandaGPT: One Model To Instruction-Follow Them All
19.Visual Instruction Tuning
20.MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
21.LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
22.Agents: An Open-source Framework for Autonomous Language Agents
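The Reflexion idea summarized under item 3 above (reflect on feedback in words, keep the reflections in memory, try again without touching any weights) can be illustrated with a small loop. This is only a minimal sketch under stated assumptions, not the paper's implementation: `reflexion_loop`, `toy_act`, `toy_evaluate`, and `toy_reflect` are hypothetical names, and the toy functions are simple stand-ins for what would be LLM calls and a real task evaluator.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    act: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    reflect: Callable[[str, str], str],
    task: str,
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Retry a task, feeding verbal reflections (not weight updates) into later trials."""
    memory: List[str] = []  # reflections accumulated across trials
    attempt = ""
    for _ in range(max_trials):
        attempt = act(task, memory)            # the actor sees past reflections
        success, feedback = evaluate(attempt)  # task feedback for this trial
        if success:
            break
        memory.append(reflect(attempt, feedback))  # store a verbal lesson
    return attempt, memory


# Hypothetical stand-ins; in a real system each would call a language model.
def toy_act(task: str, memory: List[str]) -> str:
    # Answers wrongly until a stored reflection tells it to add the numbers.
    return "4" if any("add" in note for note in memory) else "22"


def toy_evaluate(attempt: str) -> Tuple[bool, str]:
    if attempt == "4":
        return True, "ok"
    return False, "expected a sum, got a concatenation"


def toy_reflect(attempt: str, feedback: str) -> str:
    return f"Answer {attempt!r} failed ({feedback}); next time add the numbers."


answer, notes = reflexion_loop(toy_act, toy_evaluate, toy_reflect, "What is 2 + 2?")
print(answer)
print(notes)
```

In this toy run the first trial fails, one reflection is stored, and the second trial uses it to succeed. The only thing that changes between trials is the contents of `memory`.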
1.WebArena: A Realistic Web Environment for Building Autonomous Agents
2.3D-LLM: Injecting the 3D World into Large Language Models
3.InterAct: Exploring the Potentials of ChatGPT as a Cooperative Agent
4.The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models
5.Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
6.SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models
7.ChatLLM Network: More brains, More intelligence
8.ProAgent: Building Proactive Cooperative AI with Large Language Models
9.MetaGPT: Meta Programming for Multi-Agent Collaborative Framework
10.ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
11.A Virtual Conversational Agent for Teens with Autism Spectrum Disorder: Experimental Results and Design Lessons
12.Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
13.Multi-Turn Dialogue Agent as Sales' Assistant in Telemarketing
14.Agents: An Open-source Framework for Autonomous Language Agents
15.Improving Factuality and Reasoning in Language Models through Multiagent Debate
16.Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
17.Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents
18.RoCo: Dialectic Multi-Robot Collaboration with Large Language Models
19.Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks
20.ChatMOF: An Autonomous AI System for Predicting and Generating Metal-Organic Frameworks
21.WebGPT: Browser-assisted question-answering with human feedback
22.Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents
23.Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
24.ScienceWorld: Is your Agent Smarter than a 5th Grader?
25.CGMI: Configurable General Multi-Agent Interaction Framework
26.SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
1.Evaluating Cognitive Maps and Planning in Large Language Models with CogEval
2.On the Planning Abilities of Large Language Models