Variance aware reward smoothing for deep reinforcement learning.
Yunlong Dong
Shengjun Zhang
Xing Liu
Yu Zhang
Tan Shen
Published in:
Neurocomputing (2021)
Keyphrases
reinforcement learning
function approximation
eligibility traces
state space
reinforcement learning algorithms
reward function
optimal policy
transfer learning
temporal difference
reinforcement learning methods
average reward
machine learning
Markov decision processes
policy gradient
smoothing methods
learning algorithm
model-free
standard deviation
learning problems
supervised learning
multi-agent
covariance matrix
partially observable
smoothing algorithm
total reward
optimal control
multiscale
low variance
image smoothing
learning process
probabilistic model
variance reduction
deep learning
policy iteration