Quantifying Differences in Reward Functions.
Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike
Published in: ICLR (2021)
Keyphrases
- reward function
- inverse reinforcement learning
- reinforcement learning
- state space
- statistically significant
- multiple agents
- Markov decision processes
- policy search
- optimal policy
- reinforcement learning algorithms
- transition probabilities
- search algorithm
- Markov decision process
- machine learning
- pairwise
- linear programming
- dynamic systems
- state variables