An Off-Policy Trust Region Policy Optimization Method With Monotonic Improvement Guarantee for Deep Reinforcement Learning.

Wenjia Meng, Qian Zheng, Yue Shi, Gang Pan
Published in: IEEE Trans. Neural Networks Learn. Syst. (2022)
Keyphrases