Double Thompson Sampling in Finite Stochastic Games.
Shuqing Shi, Xiaobin Wang, Zhiyou Yang, Fan Zhang, Hong Qu
Published in: CoRR (2022)
Keyphrases
- stochastic games
- Nash equilibria
- games with incomplete information
- multiagent reinforcement learning
- Markov decision processes
- Nash equilibrium
- average reward
- learning automata
- robust optimization
- infinite horizon
- repeated games
- reinforcement learning algorithms
- finite number
- multi agent
- Monte Carlo
- game theory
- long run
- optimal control
- optimal policy