Intrinsic fluctuations of reinforcement learning promote cooperation.

Wolfram Barfuss, Janusz Meylahn

Published in: CoRR (2022)

Keyphrases

reinforcement learning
multi agent
cooperative
function approximation
learning algorithm
state space
model free
learning process
machine learning
Markov decision processes
function approximators
reinforcement learning algorithms
action selection
optimal control
multi agent systems
data sets
learning problems
transfer learning
optimal policy
temporal difference
temporal difference learning
information sharing
information exchange
dynamic programming
control system
learning capabilities
real robot
multi agent reinforcement learning
direct policy search