Solving multichain stochastic games with mean payoff by policy iteration.
Marianne Akian, Jean Cochet-Terrasson, Sylvie Detournay, Stephane Gaubert
Published in: CDC (2013)
Keyphrases
- stochastic games
- average reward
- policy iteration
- Markov decision processes
- repeated games
- optimal policy
- reinforcement learning
- model free
- fixed point
- long run
- multiagent reinforcement learning
- state space
- finite state
- least squares
- temporal difference
- infinite horizon
- reinforcement learning algorithms
- Markov decision process
- linear programming
- convergence rate
- game theory
- optimal control
- reward function
- sufficient conditions
- finite horizon
- Markov chain
- cooperative