Coupling based estimation approaches for the average reward performance potential in Markov chains.
Yanjie Li, Xinyu Wu, Yunjiang Lou, Haoyao Chen, Jiangang Li
Published in: Autom. (2018)
Keyphrases
- Markov chain
- average reward
- steady state
- finite state
- sample path
- long run
- transition probabilities
- random walk
- Monte Carlo
- Markov decision processes
- Markov model
- semi-Markov decision processes
- state space
- optimal policy
- model free
- decision problems
- optimality criterion
- Markov processes
- reinforcement learning
- least squares
- search algorithm