Policy Regret in Repeated Games.
Raman Arora
Michael Dinitz
Teodor Vanislavov Marinov
Mehryar Mohri
Published in:
NeurIPS (2018)
Keyphrases
repeated games
incomplete information
average reward
stochastic games
optimal policy
reward function
multi armed bandit problems
nash equilibrium
online learning
game theoretic
lower bound
worst case
markov decision processes
genetic algorithm
game theory
expert systems
infinite horizon
markov decision process
search algorithm
multi agent
reinforcement learning
expected reward