Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation
Publication Authors
J. de Heuvel;
F. Seiler;
M. Bennewitz
Published in
IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
Year of Publication
2024
Abstract
To align mobile robot navigation policies with user preferences through
reinforcement learning from human feedback (RLHF), reliable and
behavior-diverse user queries are required. However, deterministic policies
fail to generate a variety of navigation trajectory suggestions for a given
navigation task configuration. We introduce EnQuery, a query generation
approach using an ensemble of policies that achieve behavioral diversity
through a regularization term. For a given navigation task, EnQuery produces
multiple navigation trajectory suggestions, thereby optimizing the efficiency
of preference data collection with fewer queries. Our methodology demonstrates
superior performance in aligning navigation policies with user preferences in
low-query regimes, offering enhanced policy convergence from sparse preference
queries. The evaluation is complemented with a novel explainability
representation, capturing full scene navigation behavior of the mobile robot in