"I'm sorry Dave, I'm afraid I can't do that" Deep Q-learning from forbidden action.

Mathieu Seurin, Philippe Preux, Olivier Pietquin

Published in: CoRR (2019)

Keyphrases

action selection
reinforcement learning
learning algorithm
state action
multi agent
cooperative
state space
function approximation
human actions
decision making
neural network
evaluation function
reinforcement learning algorithms
initial state
potential field
continuous state spaces
reward shaping