Forward Recursion for Markov Decision Processes with Skip-Free-to-the-Right Transitions, Part I: Theory and Algorithm.
Jacob Wijngaard, Shaler Stidham Jr.
Published in: Math. Oper. Res. (1986)
Keyphrases
- markov decision processes
- dynamic programming
- model based reinforcement learning
- np hard
- learning algorithm
- computational complexity
- search space
- state space
- search algorithm
- optimal policy
- reinforcement learning
- policy iteration
- decision theoretic
- finite state
- average reward
- probabilistic planning
- incremental algorithms
- multi agent
- transition matrices