Selective imitation on the basis of reward function similarity.

Max Taylor-Davies, Stephanie Droop, Christopher G. Lucas

Published in: CoRR (2023)

Keyphrases

reward function
reinforcement learning
Markov decision processes
similarity measure
reinforcement learning algorithms
inverse reinforcement learning
state space
optimal policy
multiple agents
machine learning
decision making
function approximation
robot navigation
initially unknown
hierarchical reinforcement learning