Penalized Bootstrapping for Reinforcement Learning in Robot Control

Publication Authors C. Gebauer; M. Bennewitz
Published in International Conference on Machine Learning and Applications (ICMLA)
Year of Publication 2020
Abstract

The recent progress in reinforcement learning algorithms has enabled more complex tasks and, at the same time, reinforced the need for a careful balance between exploration and exploitation. Enhanced exploration reduces the need to tightly constrain the agent, e.g., with complex reward functions. This is highly promising, as it reduces the effort required to learn new tasks while improving the agent's performance. In this paper, we address deep exploration in reinforcement learning. Our approach is based on Thompson sampling and keeps multiple hypotheses of the posterior knowledge. We maintain the distribution over the hypotheses with a potential-field-based penalty function. The resulting policy collects more reward, and our method is faster in both application and training than the current state of the art. We evaluate our approach on low-level robot control tasks to back up our claims of a more performant policy and a faster training procedure.
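The idea described above can be illustrated with a minimal sketch: a small ensemble of value "hypotheses", one of which is sampled per step (Thompson sampling) and acted on greedily, while a repulsive potential-field term keeps the hypotheses from collapsing onto each other. Everything here — the bandit setup, the Gaussian-kernel repulsion, and all hyperparameters — is an assumption for illustration, not the authors' exact algorithm.

```python
import numpy as np

# Illustrative sketch only (NOT the paper's exact method): K bootstrapped
# value hypotheses on a 3-armed bandit. Per step we sample one hypothesis
# (Thompson sampling), act greedily under it, and apply a TD update plus a
# repulsive potential-field penalty that keeps the hypotheses diverse.
rng = np.random.default_rng(0)

K, n_actions = 5, 3                         # hypotheses, actions (assumed)
true_means = np.array([0.0, 0.3, 0.9])      # toy reward means (assumed)
Q = 1.0 + rng.normal(0.0, 0.01, size=(K, n_actions))  # optimistic init

alpha, lam = 0.1, 0.005                     # learning rate, penalty weight

for step in range(5000):
    k = rng.integers(K)                     # Thompson sampling: one hypothesis
    a = int(np.argmax(Q[k]))                # act greedily under hypothesis k
    r = rng.normal(true_means[a], 0.05)     # noisy reward

    # TD-style update of the sampled hypothesis toward the observed reward
    Q[k, a] += alpha * (r - Q[k, a])

    # Potential-field penalty: push hypothesis k away from the others, so
    # the ensemble keeps representing distinct posterior samples. A bounded
    # Gaussian-kernel repulsion is used here for numerical stability.
    for j in range(K):
        if j != k:
            diff = Q[k] - Q[j]
            Q[k] += lam * diff * np.exp(-diff @ diff)

best = int(np.argmax(Q.mean(axis=0)))       # ensemble-mean greedy action
print(best)
```

With optimistic initialization every hypothesis tries all arms before settling, and the ensemble-mean greedy action recovers the best arm while the penalty keeps the individual hypotheses spread out around it.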

Type of Publication Conference Proceeding