Minimax Regret Bounds for Reinforcement Learning.

Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos

Published in: CoRR (2017)

Keyphrases

minimax regret
reinforcement learning
reward function
preference elicitation
utility function
decision problems
upper bound
lower bound
state space
stochastic programming
optimal policy
worst case
Markov decision processes
misclassification costs
multi agent
learning algorithm
transfer learning
multi class
machine learning
data sets
dynamic programming
cost sensitive
learning process