An Approximately Optimal Relative Value Learning Algorithm for Averaged MDPs with Continuous States and Actions.

Published in: Allerton (2019)
