Theory of Mind for Deep Reinforcement Learning in Hanabi.
Andrew Fuchs, Michael Walton, Theresa Chadwick, Doug Lange
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- learning algorithm
- Markov decision processes
- state space
- temporal difference
- optimal policy
- learning process
- robotic control
- multi-agent reinforcement learning
- multi-agent
- machine learning
- temporal difference learning
- deep learning
- autonomous learning
- policy search
- belief nets
- control problems
- reinforcement learning algorithms
- transfer learning
- model free
- learning problems
- data sets
- supervised learning
- dynamic programming
- search space
- multiscale
- information retrieval
- neural network