Deterministic policy optimization with clipped value expansion and long-horizon planning.

Published in: Neurocomputing (2022)

Keyphrases