A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games.
Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Tong Zhang
Published in: CoRR (2022)
Keyphrases
- sampling algorithm
- markov games
- markov decision processes
- multiagent reinforcement learning
- markov chain monte carlo
- reinforcement learning algorithms
- markov decision process
- reinforcement learning
- control problems
- stochastic games
- random sampling
- metropolis hastings
- multiagent systems
- multi agent
- posterior distribution
- state space
- cooperative
- probability distribution
- nash equilibrium
- posterior probability
- parameter estimation
- optimal policy
- markov chain
- generative model
- model free
- bayesian framework
- reward function
- game playing
- multi view
- particle filter
- bayesian inference
- monte carlo
- infinite horizon
- learning algorithm
- dynamic programming
- sample size