Stochastic first-order methods for average-reward Markov decision processes.
Tianjiao Li, Feiyang Wu, Guanghui Lan
Published in: CoRR (2022)
Keyphrases
- Markov decision processes
- average reward
- optimal policy
- policy iteration
- semi-Markov decision processes
- reinforcement learning
- finite state
- state space
- long run
- model free
- stochastic games
- planning under uncertainty
- reinforcement learning algorithms
- discounted reward
- decision theoretic planning
- state and action spaces
- multi agent
- optimality criterion
- hierarchical reinforcement learning