Long-term Safe Reinforcement Learning with Binary Feedback.
Akifumi Wachi, Wataru Hashimoto, Kazumune Hashimoto
Published in: CoRR (2024)
Keyphrases
- long term
- reinforcement learning
- short term
- function approximation
- model free
- reinforcement learning algorithms
- learning process
- state space
- relevance feedback
- multi agent reinforcement learning
- temporal difference
- markov decision processes
- feedback mechanisms
- multi agent
- policy search
- user feedback
- non binary
- genetic algorithm
- data sets
- medium term
- robotic control
- assessment tool
- temporal difference learning
- action space
- learning capabilities
- digital libraries
- learning environment
- learning algorithm
- machine learning