A Closer Look at Invalid Action Masking in Policy Gradient Algorithms.

Shengyi Huang, Santiago Ontañón

Published in: FLAIRS (2022)

Keyphrases

computational complexity
gradient ascent
machine learning algorithms
neural network
reinforcement learning
worst case
optimization problems