A Closer Look at Invalid Action Masking in Policy Gradient Algorithms.

Shengyi Huang, Santiago Ontañón

Published in: CoRR (2020)

Keyphrases

gradient ascent
neural network
computational complexity
learning algorithm
worst case
sufficient conditions