Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective.
Ting-Han Fan
Peter J. Ramadge
Published in:
CoRR (2021)
Keyphrases
bias variance
actor critic
trade off
reinforcement learning
temporal difference
optimal control
policy gradient
approximate dynamic programming
neuro fuzzy
function approximation
reinforcement learning algorithms
gradient method
low variance