Long-Term Safe Reinforcement Learning with Binary Feedback.
Akifumi WachiWataru HashimotoKazumune HashimotoPublished in: AAAI (2024)
Keyphrases
- long term
- reinforcement learning
- short term
- function approximation
- non binary
- partially observable
- reinforcement learning algorithms
- temporal difference
- optimal policy
- relevance feedback
- state space
- robotic control
- markov decision processes
- dynamic programming
- user feedback
- information systems
- medium term
- feedback mechanisms
- multi agent reinforcement learning
- sensory inputs
- learning tasks
- action selection
- real time
- transfer learning
- machine learning
- data sets