Theory of Mind for Deep Reinforcement Learning in Hanabi.

Andrew Fuchs, Michael Walton, Theresa Chadwick, Doug Lange

Published in: CoRR (2021)

Keyphrases

reinforcement learning
function approximation
learning algorithm
Markov decision processes
state space
temporal difference
optimal policy
learning process
robotic control
multi-agent reinforcement learning
multi-agent
machine learning
temporal difference learning
deep learning
autonomous learning
policy search
belief nets
control problems
reinforcement learning algorithms
transfer learning
model free
learning problems
data sets
supervised learning
dynamic programming
search space
multiscale
information retrieval
neural network