Learning to Plan Variable Length Sequences of Actions with a Cascading Bandit Click Model of User Feedback.

Published in: AISTATS (2022)

Keyphrases