Learning in Two-Player Matrix Games by Policy Gradient Lagging Anchor.

Shiyao Ding, Toshimitsu Ushio

Published in: IEICE Trans. Fundam. Electron. Commun. Comput. Sci. (2019)

Keyphrases

learning agents
learning process
stochastic games
policy gradient
learning algorithm
supervised learning
learning problems
game theoretic
dynamic environments
learning tasks
optimization methods
actor critic
model free reinforcement learning