State Classification of Time-Nonhomogeneous Markov Chains and Average Reward Optimization of Multi-Chains.
Xi-Ren Cao
Published in: IEEE Trans. Autom. Control. (2016)
Keyphrases
- markov chain
- average reward
- transition probabilities
- state space
- steady state
- markov decision processes
- sample path
- finite state
- optimal policy
- monte carlo
- discounted reward
- state action
- markov model
- random walk
- machine learning
- markov processes
- reward function
- stationary distribution
- multi agent
- transition matrix
- state variables
- markov decision process
- support vector
- probabilistic automata
- reinforcement learning