Learning Altruistic Behaviours in Reinforcement Learning without External Rewards.

Tim Franzmeyer, Mateusz Malinowski, João F. Henriques

Published in: CoRR (2021)

Keyphrases

reinforcement learning
learning algorithm
learning process
reinforcement learning methods
machine learning
state space
online learning
knowledge acquisition
neural network
optimal policy
reinforcement learning algorithms
learning tasks
Markov decision processes
learning systems
unsupervised learning
function approximation
supervised learning
state action
learning agents
temporal difference learning
dynamic programming
RL algorithms
eligibility traces