Off-Policy Actor-Critic with Shared Experience Replay.
Simon Schmitt
Matteo Hessel
Karen Simonyan
Published in:
ICML (2020)
Keyphrases
actor critic
reinforcement learning
optimal control
policy gradient
approximate dynamic programming
gradient method
neuro fuzzy
temporal difference
reinforcement learning algorithms
decision making
dynamic programming
sufficient conditions
policy iteration