Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes.
Mohammed Shahid Abdulla, Shalabh Bhatnagar
Published in: Discret. Event Dyn. Syst. (2007)
Keyphrases
- markov decision processes
- average cost
- policy iteration
- reinforcement learning
- optimal policy
- factored mdps
- finite state
- state space
- action sets
- dynamic programming
- partially observable
- reinforcement learning algorithms
- reachability analysis
- infinite horizon
- decision theoretic planning
- policy evaluation
- model based reinforcement learning
- markov decision process
- transition matrices
- state and action spaces
- continuous state spaces
- average reward
- approximate dynamic programming
- learning algorithm
- optimal control
- partially observable markov decision processes
- risk sensitive
- finite horizon
- stationary policies
- initial state
- reward function
- model free
- stochastic shortest path
- discounted reward
- state abstraction
- function approximators
- temporal difference
- long run
- search space
- computational complexity
- decision making