Improvement of prioritized experience replay mechanism based on deep deterministic policy gradient algorithm.

Xin Zhang, Yihuan Xu

Published in: AIPR (2023)

Keyphrases

dynamic programming
policy gradient
worst case
Monte Carlo
neural network
learning algorithm
optimal solution
cost function
NP-hard
convergence rate
search space
evolutionary algorithm
k-means
state space
actor-critic
natural gradient