Steady-State Planning in Expected Reward Multichain MDPs.
George K. Atia, Andre Beckus, Ismail Alkhouri, Alvaro Velasquez
Published in: J. Artif. Intell. Res. (2021)
Keyphrases
- steady state
- markov decision processes
- expected reward
- optimal policy
- markov chain
- finite horizon
- state space
- average reward
- partially observable
- partially observable markov decision processes
- planning problems
- markov decision problems
- finite state
- reinforcement learning
- dynamic programming
- product form
- policy iteration
- state dependent
- heuristic search
- steady states
- reward function
- explicit expressions
- queueing networks
- infinite horizon
- queue length
- initial state
- markov decision process
- service times
- average cost
- action space
- decision problems
- heavy traffic
- transition probabilities
- arrival rate
- dynamical systems
- multistage
- linear programming
- special case
- search algorithm
- multi agent