Human Preference Scaling with Demonstrations For Deep Reinforcement Learning.
Zehong Cao, Kaichiu Wong, Chin-Teng Lin
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- user preferences
- human interaction
- human subjects
- function approximation
- human behavior
- model free
- neural network
- reinforcement learning algorithms
- human experts
- temporal difference
- autonomous learning
- human operators
- action selection
- human users
- state space
- dynamic programming
- artificial intelligence