Approximation and adaptive control of Markov processes: Average reward criterion.
Onésimo Hernández-Lerma
Published in: Kybernetika (1987)
Keyphrases
- adaptive control
- markov processes
- average reward
- optimality criterion
- markov chain
- reinforcement learning
- markov decision processes
- markov process
- long run
- control method
- optimal policy
- steady state
- stochastic processes
- dynamic environments
- transition probabilities
- model free
- state space
- queueing networks
- finite state
- random walk
- policy iteration
- non stationary
- control law
- markov model
- stationary distribution
- stochastic process
- random variables
- partially observable markov decision processes
- path planning
- machine learning