Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay.
Dogan C. Cicek
Enes Duran
Baturay Saglam
Furkan B. Mutlu
Suleyman S. Kozat
Published in:
CoRR (2021)
Keyphrases
learning algorithm
neural network
machine learning algorithms
black box
worst case
gradient ascent
search algorithm
optimization methods