Return-based Scaling: Yet Another Normalisation Trick for Deep RL.
Tom Schaul
Georg Ostrovski
Iurii Kemaev
Diana Borsa
Published in: CoRR (2021)
Keyphrases
reinforcement learning
function approximation
card games
optimal policy
learning algorithm
markov decision processes
complex domains
text classification
sufficient conditions
multi agent
decision trees
action selection
temporal difference
action space
scaling factors
partially observable domains
machine learning