A Policy Gradient Method for Task-Agnostic Exploration.
Mirco Mutti
Lorenzo Pratissoli
Marcello Restelli
Published in:
CoRR (2020)
Keyphrases
gradient method
policy gradient
actor critic
convergence rate
action selection
step size
negative matrix factorization
convex formulation
optimization methods
optimal policy
data sets
multiscale