An Improved Trust-Region Method for Off-Policy Deep Reinforcement Learning.

Hepeng Li, Xiangnan Zhong, Haibo He
Published in: IJCNN (2023)
Keyphrases
  • trust region
  • reinforcement learning
  • learning algorithm
  • cost function
  • dynamic programming
  • optimization method
  • least squares