Human Preference Scaling with Demonstrations For Deep Reinforcement Learning.

Zehong Cao, Kaichiu Wong, Chin-Teng Lin

Published in: CoRR (2020)

Keyphrases

reinforcement learning
user preferences
human interaction
human subjects
function approximation
human behavior
model free
neural network
reinforcement learning algorithms
human experts
temporal difference
autonomous learning
human operators
action selection
human users
state space
dynamic programming
artificial intelligence