Multi-Preference Actor Critic.
Ishan Durugkar
Matthew J. Hausknecht
Adith Swaminathan
Patrick MacAlpine
Published in:
CoRR (2019)
Keyphrases
actor critic
reinforcement learning
gradient method
policy gradient
temporal difference
optimal control
approximate dynamic programming
function approximation
machine learning
neuro fuzzy
reinforcement learning algorithms