Smoothed Action Value Functions for Learning Gaussian Policies.

Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans

Published in: CoRR (2018)

Keyphrases

learning process
learning algorithm
supervised learning
online learning
prior knowledge
knowledge acquisition
learning tasks
data sets
multiscale
reinforcement learning
collaborative learning
learning systems
learning experience
background knowledge
lead time