Smoothed Action Value Functions for Learning Gaussian Policies.

Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans

Published in: ICML (2018)

Keyphrases

supervised learning
learning systems
learning process
prior knowledge
learning algorithm
learning tasks
active learning
learning problems
training data
online learning
optimal policy
learning analytics
elementary school