The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate.

Yinyu Ye
Published in: Math. Oper. Res. (2011)
Keyphrases
  • policy iteration
  • Markov decision processes
  • Markov decision problems
  • linear programming
  • optimal policy
  • infinite horizon
  • machine learning
  • reinforcement learning
  • multi-objective
  • finite state
  • transition matrices