On undiscounted semi-Markov decision processes with absorbing states.
Prasenjit Mondal
Published in: Math. Methods Oper. Res. (2016)
Keyphrases
- semi-Markov decision processes
- average reward
- Markov decision processes
- Markov chain
- optimal policy
- Markov decision problems
- long run
- stochastic games
- reinforcement learning
- policy iteration
- transition probabilities
- initial state
- state space
- random walk
- finite state
- model free
- dynamic programming
- linear programming
- state variables
- infinite horizon
- partially observable
- decision makers
- Markov decision process