Provable Safe Reinforcement Learning with Binary Feedback.
Andrew Bennett, Dipendra Misra, Nathan Kallus
Published in: AISTATS (2023)
Keyphrases
- stochastic approximation
- reinforcement learning
- monte carlo
- temporal difference learning
- database
- feedback loop
- user feedback
- function approximation
- feedback mechanisms
- assessment tool
- action space
- non binary
- temporal difference
- hamming distance
- transfer learning
- optimal policy
- relevance feedback
- case study
- learning algorithm
- machine learning
- data sets