An Off-Policy Trust Region Policy Optimization Method With Monotonic Improvement Guarantee for Deep Reinforcement Learning.

Published in: IEEE Trans. Neural Networks Learn. Syst. (2022)

Keyphrases