Contextual Combinatorial Multi-armed Bandits with Volatile Arms and Submodular Reward.

Lixing Chen, Jie Xu, Zhuo Lu

Published in: NeurIPS (2018)

Keyphrases

multi-armed bandits
bandit problems
multi-armed bandit
decision problems
contextual information
greedy algorithm
multi-armed bandit problems
reinforcement learning
objective function
evolutionary algorithm
NP-hard
dynamic programming
expected utility