Minimax Regret Bounds for Reinforcement Learning.
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos
Published in: ICML (2017)
Keyphrases
- minimax regret
- reinforcement learning
- reward function
- preference elicitation
- decision problems
- utility function
- upper bound
- optimal policy
- stochastic programming
- lower bound
- state space
- learning problems
- learning algorithm
- multi-agent
- optimization criterion
- Markov decision processes
- worst case
- special case
- misclassification costs
- transfer learning
- training and test data