On Lower Bounds for Regret in Reinforcement Learning.
Ian Osband, Benjamin Van Roy
Published in: CoRR (2016)
Keyphrases
- lower bound
- reinforcement learning
- upper bound
- branch and bound algorithm
- branch and bound
- optimal solution
- worst case
- reinforcement learning algorithms
- function approximation
- objective function
- NP-hard
- lower and upper bounds
- state space
- Markov decision processes
- model free
- regret bounds
- learning algorithm
- upper and lower bounds
- quadratic assignment problem
- optimal policy
- online algorithms
- optimal cost
- machine learning
- VC dimension
- total reward
- learning problems
- learning process
- sample complexity
- optimal control
- temporal difference
- transfer learning
- online learning
- Markov decision process
- function approximators
- dynamic programming
- support vector
- expert advice
- multi-agent
- e-learning