Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation

Authors:

J. de Heuvel, F. Seiler, M. Bennewitz

Type:

Preprint

Published in:

arXiv preprint

Year:

2024

Links:

Preprint

BibTeX String

@misc{deheuvel2024enquery,
      title={EnQuery: Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation}, 
      author={Jorge de Heuvel and Florian Seiler and Maren Bennewitz},
      year={2024},
      eprint={2404.04852},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}

Abstract:

To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task configuration. We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot.