Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints.

Siow Meng Low, Akshat Kumar

Published in: CoRR (2024)

Keyphrases

reinforcement learning
multi agent
global constraints
function approximation
constraint programming
optimal policy
partially observable domains
reinforcement learning algorithms
markov decision processes
supervised learning
constraint satisfaction
state space
dynamic programming
learned knowledge
previously learned
learning process
temporal difference
learning algorithm
stochastic process
reinforcement learning methods
reinforcement learning agents
data sets