Stationary Anonymous Sequential Games with Undiscounted Rewards.

Piotr Wiecek, Eitan Altman

Published in: J. Optim. Theory Appl. (2015)

Keyphrases

Markov decision processes
stochastic games
non-stationary
Nash equilibria
reinforcement learning
computer games
video games
policy iteration
multi-armed bandit
average reward
game design
reinforcement learning algorithms
human computation
infinite horizon
digital games
game playing
game play
Nash equilibrium
peer-to-peer
educational games
imperfect information
coalitional games
Markov decision problems
board game
stationary policies
total reward
dynamic programming