Preference-based Reinforcement Learning with Finite-Time Guarantees.

Yichong Xu, Ruosong Wang, Lin F. Yang, Aarti Singh, Artur Dubrawski

Published in: CoRR (2020)

Keyphrases

reinforcement learning
state and action spaces
function approximation
model-free
robotic control
temporal difference
finite number
optimal policy
Markov decision processes
machine learning
learning capabilities
user preferences
multi-agent
learning algorithm
optimal control
transfer learning
logic programs
finite automata
multi-agent reinforcement learning
data sets
unit length