Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards.

Published in: CoRR (2022)
