On Polynomial Time PAC Reinforcement Learning with Rich Observations.
Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire
Published in: CoRR (2018)
Keyphrases
- reinforcement learning
- special case
- statistical queries
- function approximation
- state space
- multi-agent
- real world
- worst case
- real robot
- optimal control
- Markov decision processes
- noise tolerant
- optimal policy
- computational complexity
- high level
- learning algorithm
- deterministic domains
- transfer learning
- DNF formulas
- action space
- PAC learning
- temporal difference
- action selection
- approximation algorithms
- sample size
- supervised learning
- machine learning