Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits.
Aurélien Bibaut, Antoine Chambaz, Mark J. van der Laan
Published in: UAI (2020)
Keyphrases
- detection algorithm
- high accuracy
- dynamic programming
- times faster
- cost function
- optimization algorithm
- computationally efficient
- experimental evaluation
- objective function
- preprocessing
- computational complexity
- learning algorithm
- NP-hard
- significant improvement
- search space
- linear programming
- particle swarm optimization
- expectation maximization
- matching algorithm
- least squares
- worst case
- simulated annealing
- EM algorithm
- data streams
- path planning