Robust optimal policies for Markov decision processes with safety-threshold constraints.
Rayna Dimitrova, Jie Fu, Ufuk Topcu
Published in: CDC (2016)
Keyphrases
- partially observable Markov decision processes
- Markov decision processes
- optimal policy
- finite state
- state space
- decision problems
- reinforcement learning
- average reward
- dynamic programming
- partially observable
- finite horizon
- infinite horizon
- policy iteration
- long run
- average cost
- transition matrices
- multistage
- Markov decision process
- decision processes
- state dependent
- reinforcement learning algorithms
- sufficient conditions
- state and action spaces
- initial state
- action space
- reward function
- semi-Markov decision processes
- machine learning
- data mining
- total reward
- long run average cost