"I'm Sorry Dave, I'm Afraid I Can't Do That" Deep Q-Learning from Forbidden Actions.
Mathieu Seurin
Philippe Preux
Olivier Pietquin
Published in:
IJCNN (2020)
Keyphrases
action selection
reinforcement learning
function approximation
state action space
cooperative
learning algorithm
state space
multi agent
reasoning about actions
state action
learning agent
goal directed
reward function
plan recognition
stochastic approximation
temporal difference
optimal policy
data sets
initial state
decision theoretic
situation calculus
decision making
bucket brigade
neural network