Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards.
Alexander Trott
Stephan Zheng
Caiming Xiong
Richard Socher
Published in:
CoRR (2019)
Keyphrases
reinforcement learning
bandit problems
reward function
feature selection
distance measure
image classification
markov decision processes
machine learning
combinatorial optimization
multi-armed bandit