Transience in Countable MDPs.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke
Published in: CONCUR (2021)
Keyphrases
- average cost
- markov decision processes
- reinforcement learning
- finite state
- long run
- state space
- finite horizon
- optimal policy
- markov chain
- factored mdps
- dynamic programming
- finite number
- decision theoretic planning
- infinite horizon
- markov decision process
- multistage
- total cost
- optimal control
- approximate dynamic programming
- action sets
- markov decision problems
- linear programming
- stationary policies
- factored markov decision processes
- state and action spaces
- semi markov decision processes
- policy search
- decision diagrams
- average reward
- decision processes
- policy iteration
- initial state
- linear program
- least squares
- real time dynamic programming
- neural network