Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality.
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
Published in:
ICML (2021)
Keyphrases
convergence proof
actor critic
least squares
neural network
learning algorithm
gradient method
Lyapunov stability