Learning to Constrain Policy Optimization with Virtual Trust Region.
Thai Hung Le
Thommen Karimpanal George
Majid Abdolshah
Dung Nguyen
Kien Do
Sunil Gupta
Svetha Venkatesh
Published in:
NeurIPS (2022)
Keyphrases
learning algorithm
trust region
cost function
image sequences
reinforcement learning
convergence speed