Estimates for perturbations of average Markov decision processes with a minimal state and upper bounded by stochastically ordered Markov chains.
Raúl Montes-de-Oca, Francisco Salem-Silva
Published in: Kybernetika (2005)
Keyphrases
- Markov decision processes
- state space
- Markov chain
- finite state
- average cost
- transition probabilities
- confidence intervals
- optimal policy
- reinforcement learning
- discounted reward
- average reward
- action space
- dynamic programming
- Markov decision process
- state variables
- steady state
- stationary distribution
- state abstraction
- transition matrices
- initial state
- search space
- partially observable
- random walk
- Monte Carlo
- reward function
- action sets
- decision theoretic planning
- policy iteration
- Markov processes
- state transition
- stochastic process
- finite horizon
- state dependent
- Markov model
- transition matrix
- continuous state
- stationary policies
- belief state
- infinite horizon
- machine learning
- dynamical systems
- planning problems
- Markov decision problems
- graphical models
- long run