Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game.
Haobo Fu, Weiming Liu, Shuang Wu, Yijia Wang, Tao Yang, Kai Li, Junliang Xing, Bin Li, Bo Ma, Qiang Fu, Wei Yang
Published in: ICLR (2022)
Keyphrases
- imperfect information
- actor critic
- policy gradient
- game theoretic
- game playing
- reinforcement learning
- perfect information
- game tree search
- game theory
- approximate dynamic programming
- card game
- optimal control
- gradient method
- temporal difference
- game tree
- imperfect information games
- optimization problems
- neuro fuzzy
- optimal policy
- policy iteration
- average reward
- nash equilibrium
- stochastic games
- decision problems
- multi agent systems
- evaluation function
- markov decision processes
- resource allocation
- multi agent