Policy Regret in Repeated Games.
Raman Arora, Michael Dinitz, Teodor V. Marinov, Mehryar Mohri
Published in: CoRR (2018)
Keyphrases
- repeated games
- incomplete information
- average reward
- stochastic games
- reward function
- optimal policy
- multi-armed bandit problems
- game theoretic
- game theory
- nash equilibrium
- lower bound
- expected reward
- worst case
- markov decision process
- markov decision processes
- online learning
- expert systems
- objective function
- reinforcement learning