Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost.
Sungjun Cho, Seonwoo Min, Jinwoo Kim, Moontae Lee, Honglak Lee, Seunghoon Hong
Published in: CoRR (2022)
Keyphrases
- historical data
- statistical models
- training data
- prior knowledge
- data sets
- statistical analysis
- data processing
- data structure
- data analysis
- image data
- data mining techniques
- high quality
- labeled data
- cost benefit analysis
- accurate models
- database
- learning models
- raw data
- data collection
- xml documents
- experimental data
- decision trees
- missing values
- high dimensional
- original data
- incomplete data
- knowledge discovery
- probabilistic model