Imperative Action Masking for Safe Exploration in Reinforcement Learning.
Sumanta Dey, Sharat Bhat, Pallab Dasgupta, Soumyajit Dey
Published in: EXTRAAMAS (2023)
Keyphrases
- action selection
- reinforcement learning
- active exploration
- partially observable domains
- action space
- temporal difference
- reward shaping
- exploration exploitation
- exploration strategy
- function approximation
- optimal policy
- state space
- transition model
- model based reinforcement learning
- state action
- autonomous learning
- multi agent
- robotic control
- decision problems
- reinforcement learning algorithms
- neural network
- fitted q iteration
- markov decision process
- partially observable
- information loss
- optimal control
- learning problems
- video sequences