What Can Learned Intrinsic Rewards Capture?
Zeyu Zheng
Junhyuk Oh
Matteo Hessel
Zhongwen Xu
Manuel Kroiss
Hado van Hasselt
David Silver
Satinder Singh
Published in:
CoRR (2019)
Keyphrases
reinforcement learning
multi-armed bandit
learning phase
long term and short term
database
data mining
genetic algorithm
Markov decision processes
geometric structure
bandit problems