Quantifying Differences in Reward Functions.

Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike

Published in: ICLR (2021)

Keyphrases

reward function
inverse reinforcement learning
reinforcement learning
state space
statistically significant
multiple agents
Markov decision processes
policy search
optimal policy
reinforcement learning algorithms
transition probabilities
search algorithm
Markov decision process
machine learning
pairwise
linear programming
dynamic systems
state variables