The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate.
Yinyu Ye
Published in:
Math. Oper. Res. (2011)
Keyphrases
policy iteration
Markov decision processes
Markov decision problems
linear programming
optimal policy
infinite horizon
machine learning
reinforcement learning
multi-objective
finite state
transition matrices