On the First Passage g-Mean-Variance Optimality for Discounted Continuous-Time Markov Decision Processes.
Xianping Guo, Xiangxiang Huang, Yi Zhang
Published in: SIAM J. Control. Optim. (2015)
Keyphrases
- markov decision processes
- stationary policies
- average cost
- state space
- average reward
- action sets
- finite state
- optimal policy
- dynamic programming
- reinforcement learning
- infinite horizon
- policy iteration
- optimal control
- finite horizon
- reachability analysis
- total reward
- reinforcement learning algorithms
- markov decision process
- transition matrices
- action space
- decision theoretic planning
- markov chain
- planning under uncertainty
- reward function
- sufficient conditions
- model based reinforcement learning
- optimality criterion
- risk sensitive
- lot sizing
- decision processes
- factored mdps
- utility function
- state abstraction
- partially observable
- optimal solution
- multistage
- dynamical systems
- linear program
- planning problems
- stochastic shortest path
- machine learning
- initial state
- long run
- decision making