Learning to Explore with Meta-Policy Gradient.

Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng

Published in: CoRR (2018)

Keyphrases

policy gradient
learning algorithm
reinforcement learning
learning process
supervised learning
learning tasks
learning problems
function approximation