Selective imitation on the basis of reward function similarity.
Max Taylor-Davies, Stephanie Droop, Christopher G. Lucas
Published in: CoRR (2023)
Keyphrases
- reward function
- reinforcement learning
- Markov decision processes
- similarity measure
- reinforcement learning algorithms
- inverse reinforcement learning
- state space
- optimal policy
- multiple agents
- machine learning
- decision making
- function approximation
- robot navigation
- initially unknown
- hierarchical reinforcement learning