Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
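- Paper title: Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
- First published: 2023.10
- arXiv: https://arxiv.org/abs/2310.01320
- GitHub: https://github.com/Shenzhi-Wang/recon
- Website: https://shenzhi-wang.github.io/avalon_recon/
- Withdrawn from ICLR 2024. I don't know where it has been published since, perhaps at an NLP venue I'm not familiar with. The work seems quite well known, though: when I ask people about papers on LLM agents playing games, this is usually the first one they mention.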
- Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
- 01 main idea
- 02 Prompts copied from the original paper
01 main idea
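This is a pure prompt-engineering paper.
- Core contributions:
- It introduces a two-order contemplation prompt (ReCon, Recursive Contemplation):
- First-order contemplation infers what the other players are thinking; second-order contemplation infers how the others would react if I said a particular thing.
- It defines several evaluation metrics and uses GPT-4 to assess responses against them:
- Concealment: whether the LLM exposed its own role;
- Logic: whether the LLM's statements are self-consistent;
- Contribution: the impact on the team;
- Persuasiveness: the ability to influence others' decisions;
- Information: the efficiency of information transfer;
- Creativity.
- Experiments: the models used are ChatGPT and Claude. In each setting, either the good side or the evil side is equipped with ReCon, while the other side uses a plain prompt.
- Personal take: this work provides an effective prompt template for games such as Werewolf and Avalon that require hiding one's identity and deceiving others. After it, research on LLM agents playing such games will likely need to use similar prompts. Speculating freely: prompt-only work on this kind of text game may no longer be enough after this paper, and getting published would require some additional novelty (as in the later Werewolf studies).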
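To make the two orders of contemplation concrete, here is a minimal Python sketch of one turn. It is my own reading of the idea, not the authors' implementation: the function name `recon_turn`, the `LLM` callable type, the inline prompt strings, and the exact ordering of the four calls are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt string to a model reply.
LLM = Callable[[str], str]


def recon_turn(llm: LLM, role: str, history: str) -> Dict[str, str]:
    """One ReCon-style turn: analyze others, draft, predict reactions, revise."""
    # First-order step: guess what the other players think and want.
    first_order = llm(
        f"You are {role}. Game history:\n{history}\n"
        "Infer the likely roles and intentions of the other players."
    )
    # Draft private reasoning (THINK) and a public statement (SPEAK).
    draft = llm(
        f"You are {role}. Your analysis of the others:\n{first_order}\n"
        "Respond in two stages: THINK and SPEAK."
    )
    # Second-order step: predict how the others would react to the draft.
    second_order = llm(
        f"You are {role}. Your draft:\n{draft}\n"
        "How would each other role interpret it? Does it leak your role?"
    )
    # Revise the draft in light of the predicted reactions.
    final = llm(
        f"You are {role}. Draft:\n{draft}\n"
        f"Predicted reactions:\n{second_order}\n"
        "Revise THINK and SPEAK to improve your chances of winning."
    )
    return {
        "first_order": first_order,
        "draft": draft,
        "second_order": second_order,
        "final": final,
    }
```

Any chat-completion client can be wrapped into the `llm` callable; a stub that returns canned strings is enough to check that each step feeds the next.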
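02 Prompts copied from the original paper
Appendix E contains many prompt templates (the paper says they are condensed versions of the original prompts). They are copied below.
Prompting the agent to start contemplating: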
Respond in two stages: THINK and SPEAK
In think, internally strategize using history and consider possible deception.
In speak, organize your language based on your contemplation and speak accordingly.
Understand your role's main objective and break it down into chronological sub-goals based on game history. Your thought process should follow these sub-goals for a systematic approach to the main goal.
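Because the prompt asks for a THINK part and a SPEAK part, a game harness has to separate the two before showing anything to other players. The sketch below assumes the model labels the sections literally as `THINK:` and `SPEAK:`; that output format and the helper name `split_think_speak` are my assumptions, not something the paper specifies.

```python
import re
from typing import Tuple


def split_think_speak(reply: str) -> Tuple[str, str]:
    """Split a reply of the form 'THINK: ... SPEAK: ...' into (think, speak)."""
    match = re.search(r"THINK\s*:(.*?)SPEAK\s*:(.*)", reply, flags=re.DOTALL)
    if match is None:
        # Refuse to guess: broadcasting an unparsed reply could leak private thoughts.
        raise ValueError("reply does not contain THINK: and SPEAK: sections")
    return match.group(1).strip(), match.group(2).strip()
```

Only the second element would be broadcast to the table; the first stays in the agent's private context.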
First-order contemplation:
You're Player [id] with role [role]. Current situation: [current situation].
Your task is to:
Analyze [other players] based on game dialogues with roles: Merlin, Percival, Loyal Servant of Arthur, Morgana, Assassin. Morgana and Assassin are evil; others are good.
Consider:
- Quest Outcomes: Take into account the results of past missions to analyze players' roles.
- Role List: Remember the possible roles in the game — Merlin, Percival, two Loyal Servants, Morgana, Assassin — and their alignments.
- Level of Certainty: Use 'Certain' or 'Unknown' to gauge your confidence in your role guesses for each player.
- Players Disclosing Evil Roles: Be cautious around players who have openly claimed or hinted at being evil roles like Morgana or Assassin.
- Prior Guesses: Reflect on your earlier estimations of other players' roles ([previous attitude to players]), but don't rely solely on them.
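The templates use bracketed placeholders such as [id], [role], and [current situation]. A small helper can fill them in before the prompt is sent. This is a sketch under my own assumptions: the names `fill_template` and `FIRST_ORDER_HEADER` are hypothetical, and the authors' code may substitute values differently.

```python
import re
from typing import Mapping

# First line of the template above, used here as sample input.
FIRST_ORDER_HEADER = (
    "You're Player [id] with role [role]. "
    "Current situation: [current situation]."
)


def fill_template(template: str, values: Mapping[str, object]) -> str:
    """Replace each [name] placeholder with values[name]; fail on missing names."""

    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder [{key}]")
        return str(values[key])

    # Match any bracketed span that contains no nested brackets.
    return re.sub(r"\[([^\[\]]+)\]", substitute, template)
```

For example, `fill_template(FIRST_ORDER_HEADER, {"id": 3, "role": "Merlin", "current situation": "Quest 2 failed"})` yields a header addressed to Player 3. Raising on a missing key keeps a half-filled prompt from reaching the model unnoticed.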
Second-order contemplation:
You're Player [id] with role [role]. Current situation: [current situation].
Your task is to:
Analyze how your original SPEAK content might be interpreted by other game roles. Reflect on whether it may inadvertently reveal your role-specific clues.
Consider:
- The perspectives of each game role, including their probable reactions to your SPEAK content.
- Any unique hints or clues in your original SPEAK that might disclose your role.
Generating the response:
You're observing Player [id] with role [role]. Current situation: [current situation].
Your task is to:
- Evaluate if Player [id]'s actions align with [role].
- Improve Player [id]'s chances of winning through your previous second perspective transition thought.
- Keep role hint in public dialogue.
Consider:
- Target Outcome: Aim to achieve [desired result] as your role dictates in the game.
- Role Alignment: Evaluate whether your THINK and SPEAK contents align well with your role [role] in the current game state.
- Strategy Reevaluation: Consider what changes could be made to your THINK and SPEAK contents to improve your chances of winning as [role].
- Public and Private Content: Remember that THINK contents are private, while SPEAK contents are publicly visible. Strategize accordingly.