MP-TD3: Multi-Pool Prioritized Experience Replay-Based Asynchronous Twin Delayed Deep Deterministic Policy Gradient Algorithm.
Wenwen Tan
Detian Huang
Published in:
IEEE Access (2024)
Keyphrases
learning algorithm
computational complexity
dynamic programming
cost function
optimal solution
search space
worst case
objective function
NP-hard
reinforcement learning
optimization method
gradient ascent
natural gradient