technical paper

AAMAS 2020

May 11, 2020


Work-in-progress: Corrected Self Imitation Learning via Demonstrations

DOI: 10.48448/fgvf-wt42

While reinforcement learning (RL) agents have the remarkable ability to learn by interacting with their environments, this process is often slow and data-inefficient. Because environment interaction is typically expensive, many approaches have been studied to speed up RL. One popular method is to leverage human knowledge via imitation learning (IL), in which a demonstrator provides an example of the desired behavior and the agent seeks to imitate it. In this work-in-progress, we propose a new way of integrating IL and deep RL, which we call corrected self-imitation learning, in which an agent provided with a demonstration learns faster than an agent without one. Our method does not increase the number of environment interactions compared to a baseline RL method, and it works well even when the demonstrator is not an expert. We evaluate our method on the Atari game Ms. Pac-Man and achieve promising results indicating that our method has the potential to speed up deep RL algorithms.
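The abstract does not spell out the algorithm, but it highlights one key constraint: demonstrations must help learning without adding environment interactions. A common way to satisfy that constraint in off-policy deep RL is to mix demonstration transitions into each training batch drawn from the replay buffer, so updates imitate the demonstrator while still learning from the agent's own experience. The sketch below is a hypothetical illustration of that general idea, not the paper's method; the class name, `demo_fraction` parameter, and transition format are all assumptions for the example.

```python
import random


class DemoReplayBuffer:
    """Replay buffer holding both agent-collected and demonstrator-provided
    transitions. Sampling mixes a fixed share of demonstration data into
    every batch, so demonstrations influence updates without requiring any
    extra environment interaction (illustrative sketch, not the paper's
    actual algorithm)."""

    def __init__(self, demo_fraction=0.25, seed=0):
        self.agent_data = []      # transitions collected by the agent
        self.demo_data = []       # transitions supplied by the demonstrator
        self.demo_fraction = demo_fraction
        self.rng = random.Random(seed)

    def add_agent(self, transition):
        self.agent_data.append(transition)

    def add_demo(self, transition):
        self.demo_data.append(transition)

    def sample(self, batch_size):
        """Sample a batch containing roughly `demo_fraction` demonstration
        transitions; the remainder comes from the agent's own experience."""
        n_demo = min(len(self.demo_data), int(batch_size * self.demo_fraction))
        n_agent = batch_size - n_demo
        batch = self.rng.choices(self.demo_data, k=n_demo) if n_demo else []
        batch += self.rng.choices(self.agent_data, k=n_agent)
        return batch


if __name__ == "__main__":
    buf = DemoReplayBuffer(demo_fraction=0.25, seed=1)
    for i in range(10):
        buf.add_demo(("demo", i))     # hypothetical demonstrator transitions
        buf.add_agent(("agent", i))   # hypothetical agent transitions
    batch = buf.sample(8)
    print(len(batch), sum(1 for t in batch if t[0] == "demo"))
```

With a batch size of 8 and `demo_fraction=0.25`, each batch carries two demonstration transitions; a non-expert demonstrator is only one minority voice in every update, which loosely matches the robustness-to-imperfect-demonstrations property the abstract claims.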


