Be Considerate: Avoiding Negative Side Effects in Reinforcement Learning.

International Joint Conference on Autonomous Agents and Multi-agent Systems (2022)

Abstract
In sequential decision making -- whether it's realized with or without the benefit of a model -- objectives are often underspecified or incomplete. This gives discretion to the acting agent to realize the stated objective in ways that may result in undesirable outcomes, including inadvertently creating an unsafe environment or indirectly impacting the agency of humans or other agents that typically operate in the environment. In this paper, we explore how to build a reinforcement learning (RL) agent that contemplates the impact of its actions on the wellbeing and agency of others in the environment, most notably humans. We endow RL agents with the ability to contemplate such impact by augmenting their reward based on expectation of future return by others in the environment, providing different criteria for characterizing impact. We further endow these agents with the ability to differentially factor this impact into their decision making, manifesting behaviour that ranges from self-centred to self-less, as demonstrated by experiments in gridworld environments.
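The reward augmentation described above can be sketched as a convex combination of the agent's own reward and an estimate of the expected future return of others in the environment, with a scalar trade-off parameter moving behaviour from self-centred to self-less. The function name, the `alpha` parameter, and the convex-combination form below are illustrative assumptions, not the paper's exact formulation.

```python
def augmented_reward(r_self, others_return_estimate, alpha):
    """Blend the agent's own reward with an estimate of others'
    expected future return (illustrative sketch).

    alpha = 0.0 -> purely self-centred behaviour (own reward only)
    alpha = 1.0 -> purely self-less behaviour (others' return only)
    """
    assert 0.0 <= alpha <= 1.0, "trade-off must lie in [0, 1]"
    return (1.0 - alpha) * r_self + alpha * others_return_estimate
```

In practice, `others_return_estimate` would come from a learned value function over the other agents' objectives; here it is left as an input to keep the sketch self-contained.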