Do no harm: A counterfactual approach to safe reinforcement learning.
Sean Vaskov, Wilko Schwarting, Chris L. Baker
Published in: L4DC (2024)
Keyphrases
- reinforcement learning
- function approximation
- state space
- temporal difference learning
- optimal policy
- action selection
- robotic control
- perceptual aliasing
- multi-agent
- real-time
- model-free
- Markov decision processes
- policy search
- logical framework
- reinforcement learning algorithms
- dynamic programming
- learning algorithm
- machine learning
- learning problems
- transfer learning
- partially observable
- artificial neural networks
- learning environment
- autonomous learning
- information systems