Weak Human Preference Supervision for Deep Reinforcement Learning.

Zehong Cao, Kaichiu Wong, Chin-Teng Lin

Published in: IEEE Trans. Neural Networks Learn. Syst. (2021)

Keyphrases

reinforcement learning
function approximation
learning algorithm
multi-agent
learning process
computational models
active learning
supervised learning
optimal policy
optimal control
human operators
soft constraints
autonomous learning