Comments on: "A policy improvement method for constrained average Markov decision processes" [ORL 35 (2007) 434-438].
Yasemin Serin
Published in:
Oper. Res. Lett. (2009)
Keyphrases
Markov decision processes
optimal policy
dynamic programming
policy iteration
reinforcement learning
data mining
convergence rate
infinite horizon
average cost
control policy
discounted reward