Do no harm: A counterfactual approach to safe reinforcement learning.

Sean Vaskov, Wilko Schwarting, Chris L. Baker

Published in: L4DC (2024)

Keyphrases

reinforcement learning
function approximation
state space
temporal difference learning
optimal policy
action selection
robotic control
perceptual aliasing
multi-agent
real-time
model-free
Markov decision processes
policy search
logical framework
reinforcement learning algorithms
dynamic programming
learning algorithm
machine learning
learning problems
transfer learning
partially observable
artificial neural networks
learning environment
autonomous learning
information systems