On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers.
Tianchu Ji, Shraddhan Jain, Michael Ferdman, Peter A. Milder, H. Andrew Schwartz, Niranjan Balasubramanian
Published in: ACL/IJCNLP (Findings) (2021)
Keyphrases
- high dimensional
- inference process
- probability distribution
- test statistic
- uniformly distributed
- distribution function
- sparse representation
- absolute difference
- extreme values
- attribute values
- user defined
- standard deviation
- quantization error
- parameter values
- probabilistic inference
- spatial distribution
- marginal distributions
- expert systems
- lookup table
- belief networks
- joint distribution
- data sets
- random variables
- graphical models
- motion estimation
- multiresolution
- neural network