ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages.
Andrew Jesson
Chris Lu
Gunshi Gupta
Angelos Filos
Jakob Nicolaus Foerster
Yarin Gal
Published in:
CoRR (2023)
Keyphrases
actor critic
reinforcement learning
policy gradient
temporal difference
approximate dynamic programming
neural network
decision making
optimal policy
neuro fuzzy
average reward
gradient method
machine learning
learning algorithm
multi agent
dynamical systems