Average Reward Timed Games.

B. Thomas Adler, Luca de Alfaro, Marco Faella

Published in: FORMATS (2005)

Keyphrases

average reward
stochastic games
Markov decision processes
repeated games
long run
optimal policy
Markov chain
semi-Markov decision processes
optimality criterion
reinforcement learning
model-free
Nash equilibria
policy iteration
discounted reward
game theoretic
game theory
state space
total reward
hierarchical reinforcement learning
state and action spaces
Nash equilibrium
reinforcement learning algorithms
incomplete information
random walk
heuristic search
least squares