Constraints Penalized Q-learning for Safe Offline Reinforcement Learning.
Haoran Xu
Xianyuan Zhan
Xiangyu Zhu
Published in:
AAAI (2022)
Keyphrases
reinforcement learning
function approximation
reinforcement learning algorithms
state space
multi agent
model free
action selection
least squares
optimal policy
continuous state and action spaces
reinforcement learning methods
temporal difference learning
learning algorithm
loss function
temporal difference
cooperative
relational reinforcement learning
monte carlo
maximum likelihood
control problems
constraint satisfaction
multi agent reinforcement learning
stochastic approximation
markov decision process
neural network
multiagent learning
state action
transfer learning
constraint programming