Regret-optimal Strategies for Playing Repeated Games with Discounted Losses.
Vijay Kamble, Patrick Loiseau, Jean C. Walrand
Published in: CoRR (2016)
Keyphrases
- optimal strategy
- repeated games
- average reward
- decision problems
- games played
- optimal policy
- regret bounds
- stochastic games
- Markov decision processes
- game theoretic
- incomplete information
- reward function
- online learning
- imperfect information
- long run
- Monte Carlo
- Nash equilibrium
- lower bound
- infinite horizon
- reinforcement learning
- game theory
- mathematical models
- expected utility
- game playing
- dynamic programming
- worst case
- state space
- game tree
- upper bound
- policy iteration
- finite horizon
- utility function
- finite state
- computer games
- decision making
- artificial intelligence
- learning algorithm
- objective function
- partially observable Markov decision processes
- cooperative
- machine learning