Notes on equivalent stationary policies in Markov decision processes with total rewards.
Eugene A. Feinberg, Isaac Sonin
Published in: Math. Methods Oper. Res. (1996)
Keyphrases
- Markov decision processes
- stationary policies
- action sets
- state space
- finite state
- optimal policy
- reinforcement learning
- dynamic programming
- total reward
- reward function
- reinforcement learning algorithms
- policy iteration
- finite horizon
- average cost
- decision processes
- Markov decision process
- partially observable
- average reward
- infinite horizon
- action space
- planning under uncertainty
- expected reward
- multi-agent