In settings without well-defined goals, methods for reward learning
allow reinforcement learning agents to infer goals from human
feedback. Existing work has discussed the problem that such agents
may manipulate humans, or the reward learning process, in order
to gain higher reward. We introduce the neglected problem that, in
multi-agent settings, agents may have incentives to manipulate one
another’s reward functions in order to change each other’s
behavioral policies. We focus on the setting with humans acting alongside
assistive (artificial) agents who must learn the reward function by
interacting with these humans. We propose a possible solution to
manipulation of human feedback in this setting: the Shared Value
Prior (SVP). The SVP equips agents with an assumption that the
reward functions of all humans are similar. Given this assumption,
the actions of any human provide information to an agent about
its reward, and so the agent is incentivised to observe these actions
rather than to manipulate them. We present an expository example
in which the SVP prevents manipulation.
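
The informational argument above can be illustrated with a minimal sketch. It is not the authors' formal model: the two candidate reward functions, the Boltzmann-rational choice model, the value of `beta`, and the observed actions are all hypothetical, and the SVP is simplified here to the strongest case in which all humans share exactly the same reward function (the abstract only assumes the reward functions are similar).

```python
import math

# Hypothetical candidate reward functions over two actions "a" and "b".
CANDIDATE_REWARDS = {
    "likes_a": {"a": 1.0, "b": 0.0},
    "likes_b": {"a": 0.0, "b": 1.0},
}


def action_likelihood(action, reward, beta=2.0):
    """Assumed Boltzmann-rational human: P(action | reward) is proportional to exp(beta * reward[action])."""
    weights = {a: math.exp(beta * r) for a, r in reward.items()}
    return weights[action] / sum(weights.values())


def posterior(observed_actions):
    """Bayesian posterior over the candidate rewards, starting from a uniform prior."""
    unnormalised = {}
    for name, reward in CANDIDATE_REWARDS.items():
        likelihood = 1.0
        for action in observed_actions:
            likelihood *= action_likelihood(action, reward)
        unnormalised[name] = likelihood / len(CANDIDATE_REWARDS)
    total = sum(unnormalised.values())
    return {name: p / total for name, p in unnormalised.items()}


# Hypothetical observations: the agent's own human, and another human.
own_human_actions = ["a"]
other_human_actions = ["a", "a", "b", "a"]

# Without a shared prior, only the agent's own human is informative.
independent = posterior(own_human_actions)

# With a (hard) shared value prior, every human's actions are evidence
# about the same reward function, so the observations are pooled.
shared = posterior(own_human_actions + other_human_actions)

print("independent:", independent)
print("shared:     ", shared)
```

In this toy setting the pooled posterior is more concentrated than the independent one, which is the sense in which another human's unmanipulated actions are valuable to the agent as evidence.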

On Agent Incentives to Manipulate Human Feedback in Multi-Agent Reward Learning Scenarios

Poster & Demo Session PD2C


AAMAS (International Conference on Autonomous Agents and Multiagent Systems) is the largest and most influential conference in the area of agents and multiagent systems. The aim of the conference is to bring together researchers and practitioners in all areas of agent technology and to provide a single, high-profile, internationally renowned forum for research in the theory and practice of autonomous agents and multiagent systems. AAMAS is the flagship conference of the non-profit International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS).

The AAMAS conference series was initiated in 2002 in Bologna, Italy as a joint event comprising the 6th International Conference on Autonomous Agents (AA), the 5th International Conference on Multiagent Systems (ICMAS), and the 9th International Workshop on Agent Theories, Architectures, and Languages (ATAL).

Subsequent AAMAS conferences have been held in Melbourne, Australia (July 2003), New York City, NY, USA (July 2004), Utrecht, The Netherlands (July 2005), Hakodate, Japan (May 2006), Honolulu, Hawaii, USA (May 2007), Estoril, Portugal (May 2008), Budapest, Hungary (May 2009), Toronto, Canada (May 2010), Taipei, Taiwan (May 2011), Valencia, Spain (June 2012), Minnesota, USA (May 2013), Paris, France (May 2014), Istanbul, Turkey (May 2015), Singapore (May 2016), São Paulo (2017), Stockholm, Sweden (2018), Montreal (May 2019), Auckland (May 2020, virtual), and London (May 2021, virtual).



A ticket is required to attend this event. Please register using the link below:

https://aamas2022-conference.auckland.ac.nz/attending/registration/


AAMAS 2022


Francis Rhys Ward
