Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost.
Sungjun Cho, Seonwoo Min, Jinwoo Kim, Moontae Lee, Honglak Lee, Seunghoon Hong
Published in: NeurIPS (2022)
Keyphrases
- data sets
- prior knowledge
- data processing
- raw data
- experimental data
- synthetic data
- data collection
- data analysis
- training data
- data points
- statistical analysis
- xml documents
- stochastic models
- accurate models
- high quality
- image data
- knowledge discovery
- data structure
- database
- input data
- missing data
- spatial data
- social networks
- statistical methods
- neural network
- original data
- historical data
- storage space
- stochastic processes
- learned models
- data sources