Online Learning in Kernelized Markov Decision Processes.
Sayak Ray Chowdhury, Aditya Gopalan
Published in: AISTATS (2019)
Keyphrases
- markov decision processes
- online learning
- state space
- finite state
- optimal policy
- transition matrices
- reinforcement learning
- policy iteration
- e learning
- finite horizon
- active learning
- average reward
- partially observable
- decision processes
- dynamic programming
- factored mdps
- decision theoretic planning
- average cost
- reinforcement learning algorithms
- infinite horizon
- planning under uncertainty
- reachability analysis
- action sets
- model based reinforcement learning
- kernel function
- multi agent
- action space
- decision theoretic
- state and action spaces