Risk probability optimization problem for finite horizon continuous time Markov decision processes with loss rate.
Haifeng Huo, Xian Wen
Published in: Kybernetika (2021)
Keyphrases
- Markov decision processes
- finite horizon
- state space
- risk sensitive
- optimal policy
- optimal stopping
- expected reward
- infinite horizon
- finite state
- average cost
- stationary policies
- Markov decision process
- policy iteration
- reinforcement learning
- dynamic programming
- transition matrices
- optimal control
- partially observable
- average reward
- multistage
- probability distribution
- control policies
- Markov chain
- decision making
- dynamical systems
- state dependent
- reward function
- action space
- multi agent