Provably Safe PAC-MDP Exploration Using Analogies
Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter. Published in: AISTATS (2021)
Keyphrases
- markov decision processes
- exploration strategy
- reinforcement learning
- state space
- utility function
- upper bound
- computational model
- planning under uncertainty
- sample complexity
- sample size
- dynamic programming
- vc dimension
- machine learning
- dynamic programming algorithms
- noise tolerant
- analogical reasoning
- sample complexity bounds
- average case
- biological systems
- finite state
- semantic relations
- linear programming
- search algorithm