Online Meta-Critic Learning for Off-Policy Actor-Critic Methods.
Wei Zhou
Yiying Li
Yongxin Yang
Huaimin Wang
Timothy M. Hospedales
Published in:
CoRR (2020)
Keyphrases
actor critic
gradient method
reinforcement learning
learning algorithm
active learning
learning problems
optimal control
temporal difference
policy iteration
neuro fuzzy
approximate dynamic programming
machine learning algorithms
optimization methods
function approximation
policy gradient