Discovering a set of policies for the worst case reward.

Tom Zahavy, André Barreto, Daniel J. Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh

Published in: ICLR (2021)

Keyphrases