Approximation of stationary control policies by quantized control in Markov decision processes.
Naci Saldi, Tamás Linder, Serdar Yüksel
Published in: Allerton (2013)
Keyphrases
- control policies
- Markov decision processes
- optimal policy
- action space
- finite horizon
- reinforcement learning
- stationary policies
- control policy
- average cost
- state space
- finite state
- reward function
- infinite horizon
- continuous state
- dynamic programming
- transition matrices
- policy iteration
- decision problems
- Markov decision process
- multistage
- non-stationary
- average reward
- long run
- partially observable
- decision theoretic planning
- initial state
- machine learning
- control strategies
- real-valued
- queueing networks
- least squares