Model-Free Opponent Shaping

Christopher Lu, Timon Willi, Christian Schröder de Witt, Jakob N. Foerster

Published in: CoRR (2022)

Keyphrases

model free
reinforcement learning
function approximation
reinforcement learning algorithms
temporal difference
imperfect information
policy iteration
genetic algorithm
multi agent
support vector
dynamic programming
linear combination
average reward
rl algorithms
impedance control