Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods.
Chris Nota
Philip Thomas
Bruno C. Da Silva
Published in:
ICML (2021)
Keyphrases
policy gradient methods
natural actor critic
multi agent
search space
probabilistic model
posterior distribution
neural network
radial basis function
robot arm