Batch Reinforcement Learning With a Nonparametric Off-Policy Policy Gradient.

Samuele Tosatto, João Carvalho, Jan Peters

Published in: IEEE Trans. Pattern Anal. Mach. Intell. (2022)

Keyphrases