Solving Long-run Average Reward Robust MDPs via Stochastic Games.
Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Petr Novotný, Dorde Zikelic
Published in: CoRR (2023)
Keyphrases
- average reward
- long run
- stochastic games
- semi-Markov decision processes
- Markov decision processes
- optimal policy
- infinite horizon
- discounted reward
- average cost
- repeated games
- state space
- state action
- queueing networks
- reinforcement learning
- policy iteration
- sufficient conditions
- initial state
- Markov decision problems