Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints.
Siow Meng Low, Akshat Kumar
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- multi agent
- global constraints
- function approximation
- constraint programming
- optimal policy
- partially observable domains
- reinforcement learning algorithms
- markov decision processes
- supervised learning
- constraint satisfaction
- state space
- dynamic programming
- learned knowledge
- previously learned
- learning process
- temporal difference
- learning algorithm
- stochastic process
- reinforcement learning methods
- reinforcement learning agents
- data sets