Constrained Markov decision processes in Borel spaces: from discounted to average optimality.
Armando F. Mendoza-Pérez, Héctor Jasso-Fuentes, Omar A. De-la-Cruz Courtois
Published in: Math. Methods Oper. Res. (2016)
Keyphrases
- Markov decision processes
- average cost
- stationary policies
- average reward
- finite state
- optimal policy
- infinite horizon
- discounted reward
- reinforcement learning
- state space
- finite horizon
- dynamic programming
- action sets
- policy iteration
- transition matrices
- reinforcement learning algorithms
- decision theoretic planning
- model based reinforcement learning
- risk sensitive
- planning under uncertainty
- partially observable
- initial state
- Markov decision process
- factored MDPs
- action space
- reward function
- long run
- semi-Markov decision processes
- decision processes
- total reward
- reachability analysis