Model-Free Opponent Shaping.

Christopher Lu, Timon Willi, Christian A. Schröder de Witt, Jakob N. Foerster

Published in: ICML (2022)

Keyphrases

model-free
reinforcement learning
reinforcement learning algorithms
function approximation
temporal difference
policy iteration
imperfect information
neural network
multi-agent
text mining
policy evaluation