A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games.
Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Tong Zhang
Published in: ICML (2022)
Keyphrases
- sampling algorithm
- markov games
- markov decision processes
- multiagent reinforcement learning
- markov chain monte carlo
- reinforcement learning algorithms
- markov decision process
- reinforcement learning
- control problems
- stochastic games
- random sampling
- metropolis hastings
- state space
- multiagent systems
- markov chain
- multi agent
- posterior probability
- generative model
- optimal policy
- nash equilibrium
- posterior distribution
- probability distribution
- cooperative
- parameter estimation
- particle filter
- model free
- learning algorithm
- infinite horizon
- bayesian inference
- game playing
- machine learning
- dynamic programming
- approximate inference
- temporal difference
- gaussian process
- optimal control
- monte carlo
- nash equilibria
- imperfect information
- probabilistic model
- prior knowledge