Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation




Authors:

J. de Heuvel, F. Seiler, M. Bennewitz

Type:

Preprint

Published in:

arXiv preprint

Year:

2024

Links:

Preprint

BibTex String

@misc{deheuvel2024enquery,
title={EnQuery: Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation},
author={Jorge de Heuvel and Florian Seiler and Maren Bennewitz},
year={2024},
eprint={2404.04852},
archivePrefix={arXiv},
primaryClass={cs.RO}
}

Abstract:

To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task configuration. We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot.