Strongly Polynomial Algorithms for Transient and Average-Cost MDPs.
Eugene A. Feinberg, Jefferson Huang
Published in: SIGMETRICS Perform. Evaluation Rev. (2017)
Keyphrases
- average cost
- Markov decision processes
- optimal policy
- policy iteration
- long run
- learning algorithm
- optimal control
- optimization problems
- worst case
- finite number
- infinite horizon
- approximate dynamic programming
- minimum cost flow
- linear programming
- linear program
- finite state
- scheduling problem
- reinforcement learning
- machine learning
- Markov decision chains