Quantifying Differences in Reward Functions.
Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike
Published in: CoRR (2020)
Keyphrases
- reward function
- Markov decision processes
- inverse reinforcement learning
- reinforcement learning algorithms
- reinforcement learning
- statistically significant
- multiple agents
- state space
- optimal policy
- policy search
- Markov decision process
- initially unknown
- state variables
- clustering algorithm
- objective function
- transition probabilities