"I'm Sorry Dave, I'm Afraid I Can't Do That" Deep Q-Learning from Forbidden Actions.
Mathieu Seurin, Philippe Preux, Olivier Pietquin
Published in: IJCNN (2020)
Keyphrases
- action selection
- reinforcement learning
- function approximation
- state action space
- cooperative
- learning algorithm
- state space
- multi agent
- reasoning about actions
- state action
- learning agent
- goal directed
- reward function
- plan recognition
- stochastic approximation
- temporal difference
- optimal policy
- data sets
- initial state
- decision theoretic
- situation calculus
- decision making
- bucket brigade
- neural network