Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration.
Marianne Akian, Stéphane Gaubert, Zheng Qu, Omar Saadi
Published in: CDC (2019)
Keyphrases
- stochastic games
- Markov decision processes
- subgame perfect equilibrium
- perfect information
- average reward
- optimal policy
- state space
- stochastic shortest path
- finite state
- multiagent reinforcement learning
- reinforcement learning algorithms
- policy iteration
- reinforcement learning
- Markov decision problems
- dynamic programming
- Markov chain
- Markov decision process
- average cost
- infinite horizon
- imperfect information
- partially observable
- combinatorial optimization
- reward function
- solving problems