Combination of learning from non-optimal demonstrations and feedbacks using inverse reinforcement learning and Bayesian policy improvement.

Published in: Expert Syst. Appl. (2018)

Keyphrases