Minimax Regret Bounds for Reinforcement Learning.

Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos

Published in: ICML (2017)

Keyphrases

minimax regret
reinforcement learning
reward function
preference elicitation
decision problems
utility function
upper bound
optimal policy
stochastic programming
lower bound
state space
learning problems
learning algorithm
multi-agent
optimization criterion
Markov decision processes
worst case
special case
misclassification costs
transfer learning
training and test data