Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies.
Yonathan Efroni, Nadav Merlis, Mohammad Ghavamzadeh, Shie Mannor
Published in: NeurIPS (2019)
Keyphrases
- model-based reinforcement learning
- regret bounds
- lower bound
- Markov decision processes
- optimal policy
- upper bound
- Markov decision problems
- dynamic programming
- partially observable Markov decision processes
- reinforcement learning
- worst case
- Markov decision process
- feature selection
- average cost
- Bayesian networks
- linear regression
- machine learning
- sufficient conditions
- maximum likelihood
- online learning
- NP-hard
- optimal solution
- objective function