Learning Value Functions in Deep Policy Gradients using Residual Variance.

Yannis Flet-Berliac Reda Ouhamma Odalric-Ambrym Maillard Philippe Preux

Published in: ICLR (2021)

Keyphrases

learning algorithm
online learning
neural network
learning process
supervised learning
unsupervised learning
learning tasks
incremental learning
machine learning
artificial neural networks
prior knowledge
active learning
learning problems
learning analytics
action selection