An Improved Trust-Region Method for Off-Policy Deep Reinforcement Learning.
Hepeng Li
Xiangnan Zhong
Haibo He
Published in:
IJCNN (2023)
Keyphrases
trust region
reinforcement learning
learning algorithm
cost function
dynamic programming
optimization method
least squares