Policy iteration for parameterized Markov decision processes and its application.
Li Xia, Qing-Shan Jia
Published in: ASCC (2013)
Keyphrases
- policy iteration
- Markov decision processes
- optimal policy
- finite state
- sample path
- state space
- approximate dynamic programming
- factored MDPs
- dynamic programming
- average reward
- reinforcement learning
- policy evaluation
- planning under uncertainty
- transition matrices
- Markov decision problems
- decision processes
- action space
- Markov decision process
- infinite horizon
- average cost
- fixed point
- finite horizon
- partially observable
- stochastic games
- reinforcement learning algorithms
- temporal difference
- model free
- least squares
- state and action spaces
- machine learning