Bounds for Synchronizing Markov Decision Processes.
Laurent Doyen, Marie van den Bogaard
Published in: CSR (2022)
Keyphrases
- markov decision processes
- discounted reward
- state space
- upper bound
- reinforcement learning
- finite state
- transition matrices
- optimal policy
- lower bound
- average cost
- policy iteration
- finite horizon
- dynamic programming
- reinforcement learning algorithms
- reachability analysis
- risk sensitive
- upper and lower bounds
- planning under uncertainty
- factored mdps
- partially observable
- infinite horizon
- decision processes
- decision theoretic planning
- multistage
- model based reinforcement learning
- markov decision process
- action sets
- stationary policies
- search algorithm
- probabilistic planning