Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards.
Xianping Guo, Xi-Ren Cao
Published in: SIAM J. Control. Optim. (2005)
Keyphrases
- optimal control
- sample path
- markov chain
- policy iteration
- reinforcement learning
- markov processes
- average cost
- markov decision processes
- infinite horizon
- dynamic programming
- state space
- markov process
- control problems
- average reward
- steady state
- finite state
- control strategy
- random walk
- stationary distribution
- optimal policy
- lost sales
- finite horizon
- transition probabilities
- function approximation
- queueing systems
- stochastic process
- stationary points
- mathematical model
- fixed point
- probabilistic model