Non-linear Monte-Carlo Search in Civilization II

Satchuthanan R Branavan, David Silver, Regina Barzilay

IJCAI (2011)

Abstract

This paper presents a new Monte-Carlo search algorithm for very large sequential decision-making problems. We apply non-linear regression within Monte-Carlo search, online, to estimate a state-action value function from the outcomes of random rollouts. This value function generalizes between related states and actions, and can therefore provide more accurate evaluations after fewer rollouts. A further significant advantage of this approach is its ability to automatically extract and leverage domain knowledge from external sources such as game manuals. We apply our algorithm to the game of Civilization II, a challenging multiagent strategy game with an enormous state space and around 10^21 joint actions. We approximate the value function by a neural network, augmented by linguistic knowledge that is extracted automatically from the official game manual. We show that this non-linear value function is significantly more efficient than a linear value function, which is itself more efficient than Monte-Carlo tree search. Our non-linear Monte-Carlo search wins over 78% of games against the built-in AI of Civilization II.
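The core idea in the abstract, fitting a non-linear state-action value function online from rollout outcomes and using it to guide further rollouts, can be illustrated on a toy problem. The following is a minimal sketch and not the authors' implementation: the one-dimensional walk domain, the feature encoding, the epsilon-greedy rollout policy, the network size, and all names (`TinyValueNet`, `features`, `rollout`, `search`) are assumptions made for illustration, and the linguistic features drawn from the game manual are omitted entirely.

```python
import math
import random

# Hypothetical toy domain: walk on positions 0..GOAL, succeed by reaching GOAL in time.
GOAL, HORIZON, ACTIONS = 5, 4, (-1, 1)


class TinyValueNet:
    """One-hidden-layer tanh network approximating Q(state, action)."""

    def __init__(self, n_in, n_hidden, lr=0.05, rng=None):
        rng = rng or random.Random(0)
        self.lr = lr
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self._hidden(x)))

    def update(self, x, target):
        """One stochastic gradient step on the squared error (q - target)^2 / 2."""
        h = self._hidden(x)
        err = sum(w * hj for w, hj in zip(self.w2, h)) - target
        for j, hj in enumerate(h):
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)  # uses w2[j] before its update
            self.w2[j] -= self.lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= self.lr * grad_pre * xi


def features(state, action):
    # Assumed encoding: scaled position, action, and a constant bias input.
    return [state / GOAL, float(action), 1.0]


def rollout(net, state, epsilon, rng):
    """Simulate to the horizon; return visited (state, action) pairs and the outcome."""
    visited = []
    for _ in range(HORIZON):
        if state == GOAL:
            break
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: net.predict(features(state, a)))
        visited.append((state, action))
        state = max(0, min(GOAL, state + action))
    return visited, (1.0 if state == GOAL else 0.0)


def search(root, n_rollouts=500, epsilon=0.5, seed=0):
    """Fit the value network online from rollout outcomes, then act greedily at the root."""
    rng = random.Random(seed)
    net = TinyValueNet(3, 4, rng=rng)
    for _ in range(n_rollouts):
        visited, outcome = rollout(net, root, epsilon, rng)
        for s, a in visited:
            net.update(features(s, a), outcome)  # regress Q(s, a) toward the outcome
    best = max(ACTIONS, key=lambda a: net.predict(features(root, a)))
    return best, net
```

Calling `search(2)` returns the action the fitted network prefers at position 2 together with the network itself. Because the network shares weights across all states and actions, each rollout outcome informs the estimates for pairs that were never visited, which is the generalization property the abstract credits for needing fewer rollouts.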

Keywords

non-linear Monte-Carlo search, linear value function, Monte-Carlo search, Civilization II, challenging multiagent strategy game, value function, non-linear value function, game manual, value function generalizes, Monte-Carlo tree search, state-action value function
