Off-Policy Actor-Critic with Shared Experience Replay.
Simon Schmitt
Matteo Hessel
Karen Simonyan
Published in:
CoRR (2019)
Keyphrases
actor critic
reinforcement learning
optimal control
policy gradient
neuro fuzzy
reinforcement learning algorithms
gradient method
approximate dynamic programming
cost function
fuzzy logic
function approximation
temporal difference
average reward
NP-hard
sufficient conditions
dynamic environments