An Improved Off-Policy Actor-Critic Algorithm with Historical Behaviors Reusing for Robotic Control.
Huaqing Zhang
Hongbin Ma
Ying Jin
Published in:
ICIRA (4) (2022)
Keyphrases
learning algorithm
cost function
objective function
NP-hard
dynamic programming
Monte Carlo
robotic control
reinforcement learning
search space
linear programming
mathematical model
path planning