Investigation on the generalization of the Sampled Policy Gradient algorithm.

Nil Stolt Ansó

Published in: CoRR (2019)

Keyphrases

NP-hard
computational complexity
objective function
search space
worst case
policy gradient
learning algorithm
optimal solution
dynamic programming
mathematical model
path planning
neural network
multi-agent
multi-agent systems
mobile robot
Monte Carlo