Provably Efficient Learning of Transferable Rewards.

Alberto Maria Metelli, Giorgia Ramponi, Alessandro Concetti, Marcello Restelli

Published in: ICML (2021)

Keyphrases

efficient learning
reinforcement learning
markov decision processes
multi-armed bandit
worst case
concept classes
learning algorithm
bandit problems
structured prediction
data mining
learning process
software engineering
reward function