Provably Efficient Learning of Transferable Rewards.
Alberto Maria Metelli
Giorgia Ramponi
Alessandro Concetti
Marcello Restelli
Published in: ICML (2021)
Keyphrases
efficient learning
reinforcement learning
markov decision processes
multiarmed bandit
worst case
concept classes
learning algorithm
bandit problems
structured prediction
data mining
learning process
software engineering
reward function