Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor.
Thomas Dueholm Hansen
Peter Bro Miltersen
Uri Zwick
Published in:
J. ACM (2013)
Keyphrases
stochastic games
average reward
Markov decision processes
imperfect information
Nash equilibria
optimal policy
Nash equilibrium
optimal strategy
long run
reinforcement learning algorithms
neural network
objective function
game theoretic
infinite horizon
model free