Long-Term Safe Reinforcement Learning with Binary Feedback.

Akifumi Wachi Wataru Hashimoto Kazumune Hashimoto

Published in: AAAI (2024)

Keyphrases

long term
reinforcement learning
short term
function approximation
non binary
partially observable
reinforcement learning algorithms
temporal difference
optimal policy
relevance feedback
state space
robotic control
markov decision processes
dynamic programming
user feedback
information systems
medium term
feedback mechanisms
multi agent reinforcement learning
sensory inputs
learning tasks
action selection
real time
transfer learning
machine learning
data sets