Widest Paths and Global Propagation in Bounded Value Iteration for Stochastic Games.
Kittiphon Phalakarn, Toru Takisaka, Thomas Haas, Ichiro Hasuo
Published in: CoRR (2020)
Keyphrases
- stochastic games
- Markov decision processes
- average reward
- Nash equilibria
- multiagent reinforcement learning
- state space
- infinite horizon
- multi agent
- dynamic programming
- repeated games
- reinforcement learning algorithms
- single agent
- optimal policy
- reinforcement learning
- Nash equilibrium
- average cost
- heuristic search
- learning automata
- policy iteration
- game theoretic
- path finding
- Markov decision process