Sustainable ℓ2-regularized actor-critic based on recursive least-squares temporal difference learning.

Luntong Li, Dazi Li, Tianheng Song
Published in: SMC (2017)
Keyphrases