Solving multichain stochastic games with mean payoff by policy iteration.

Marianne Akian, Jean Cochet-Terrasson, Sylvie Detournay, Stephane Gaubert

Published in: CDC (2013)

Keyphrases

stochastic games
average reward
policy iteration
Markov decision processes
repeated games
optimal policy
reinforcement learning
model free
fixed point
long run
multiagent reinforcement learning
state space
finite state
least squares
temporal difference
infinite horizon
reinforcement learning algorithms
Markov decision process
linear programming
convergence rate
game theory
optimal control
reward function
sufficient conditions
finite horizon
Markov chain
cooperative