Discovering a set of policies for the worst case reward.
Tom Zahavy
André Barreto
Daniel J. Mankowitz
Shaobo Hou
Brendan O'Donoghue
Iurii Kemaev
Satinder Singh
Published in:
ICLR (2021)
Keyphrases
worst case
small number
data sets
data mining
machine learning
bayesian networks
np hard
probability distribution
input data
databases
information retrieval
reinforcement learning
lower bound
dynamic programming
finite number