Safe Policy Improvement Approaches on Discrete Markov Decision Processes.
Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft
Published in: ICAART (2) (2022)
Keyphrases
- Markov decision processes
- optimal policy
- policy iteration
- infinite horizon
- Markov decision process
- state space
- reinforcement learning
- average cost
- state and action spaces
- partially observable
- continuous state spaces
- finite horizon
- average reward
- finite state
- transition matrices
- reward function
- reachability analysis
- dynamic programming
- total reward
- action space
- long run
- Markov decision problems
- decision processes
- planning under uncertainty
- decision theoretic planning
- reinforcement learning algorithms
- control policies
- multistage
- state dependent
- continuous state
- risk sensitive
- machine learning
- partially observable Markov decision processes
- sufficient conditions
- model-based reinforcement learning