Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits.
Aurélien F. Bibaut, Antoine Chambaz, Mark J. van der Laan
Published in: CoRR (2020)
Keyphrases
- dynamic programming
- learning algorithm
- worst case
- preprocessing
- experimental evaluation
- optimal solution
- high accuracy
- cost function
- expectation maximization
- input data
- computationally efficient
- computational cost
- computational complexity
- path planning
- optimization algorithm
- detection algorithm
- k-means
- search space
- ant colony optimization
- particle swarm optimization
- multi-armed bandit
- theoretical analysis
- data sets
- linear programming
- probabilistic model
- NP-hard
- significant improvement
- objective function
- similarity measure
- genetic algorithm