Better Exploration with Optimistic Actor-Critic
Kamil Ciosek
Quan Vuong
Robert Loftin
Katja Hofmann
Published in:
NeurIPS (2019)
Keyphrases
actor critic
reinforcement learning
optimal control
policy gradient
temporal difference
approximate dynamic programming
neuro fuzzy
gradient method
action selection
policy iteration
function approximation
reinforcement learning algorithms
machine learning
neural network
multi agent systems
step size