Sufficiency of Markov Policies for Continuous-Time Jump Markov Decision Processes.
Eugene A. Feinberg, Manasa Mandava, Albert N. Shiryaev
Published in: Math. Oper. Res. (2022)
Keyphrases
- markov decision processes
- markov chain
- optimal policy
- state space
- finite state
- stationary policies
- markov decision process
- policy iteration algorithm
- markov processes
- decision processes
- average cost
- reward function
- markov model
- decentralized control
- dynamic programming
- transition probabilities
- reinforcement learning
- total reward
- partially observable markov decision processes
- markov decision problems
- discounted reward
- finite horizon
- decision problems
- transition matrices
- expected reward
- policy iteration
- infinite horizon
- reinforcement learning algorithms
- action sets
- macro actions
- partially observable
- average reward
- dynamical systems
- decision theoretic planning
- factored mdps
- planning under uncertainty
- reachability analysis
- state abstraction
- multistage
- control policies
- action space
- long run
- planning problems
- state and action spaces
- initial state
- model based reinforcement learning