Online Markov decision processes with Kullback-Leibler control cost.
Peng Guan, Maxim Raginsky, Rebecca Willett
Published in: ACC (2012)
Keyphrases
- Markov decision processes
- Kullback-Leibler
- average cost
- state space
- transition matrices
- finite state
- optimal policy
- KL divergence
- dynamic programming
- policy iteration
- reinforcement learning
- Kullback-Leibler divergence
- Markov decision process
- cross entropy
- average reward
- optimal control
- control strategy
- decision theoretic planning
- infinite horizon
- stochastic shortest path
- covariance matrix
- model selection