Steady-State Policy Synthesis in Multichain Markov Decision Processes.
George K. Atia, Andre Beckus, Ismail Alkhouri, Alvaro Velasquez
Published in: IJCAI (2020)
Keyphrases
- steady state
- markov decision processes
- optimal policy
- average reward
- policy iteration
- markov chain
- state space
- state dependent
- markov decision process
- infinite horizon
- finite horizon
- average cost
- reinforcement learning
- partially observable
- finite state
- decision problems
- state and action spaces
- reward function
- dynamic programming
- decision processes
- discounted reward
- action space
- total reward
- product form
- queue length
- transition matrices
- markov decision problems
- partially observable markov decision processes
- arrival rate
- fluid model
- expected reward
- queueing networks
- decision theoretic planning
- sufficient conditions
- continuous state spaces
- service times
- steady states
- policy evaluation
- multistage
- heavy traffic
- stationary policies
- long run
- initial state
- asymptotically optimal
- planning problems
- queue size
- model free