A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations.
Sohan Rudra, Saksham Goel, Anirban Santara, Claudio Gentile, Laurent Perron, Fei Xia, Vikas Sindhwani, Carolina Parada, Gaurav Aggarwal
Published in: CoRR (2022)