How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design.
Maria-Florina Balcan, Dan F. DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik
Published in: STOC (2021)
Keyphrases
- learning algorithm
- noisy data
- times faster
- input data
- data driven
- synthetic datasets
- computational cost
- data reduction
- data structure
- incremental algorithms
- single pass
- theoretical analysis
- incomplete data
- computational complexity
- dimensional data
- computational efficiency
- space complexity
- worst case
- algorithms require
- data sets
- benchmark problems
- iterative algorithms
- computationally efficient
- detection algorithm
- related algorithms
- k-means
- classification algorithm
- data mining techniques
- image processing algorithms
- NP-hard
- data analysis
- preprocessing
- significant improvement
- memory requirements
- training data
- large scale data sets
- objective function
- data mining algorithms
- dense regions
- high efficiency
- dynamic programming
- apriori algorithm
- convergence rate
- data clustering
- high dimensional data
- clustering method
- evolutionary algorithm
- maximum flow
- search space
- relevant attributes
- data mining