Actor-only Deterministic Policy Gradient via Zeroth-order Gradient Oracles in Action Space.

Published in: ISIT (2021)
