Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models.
Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
Published in: Interspeech (2021)
Keyphrases
- high accuracy
- computationally efficient
- modeling method
- linear model
- machine learning methods
- detection method
- probabilistic model
- neural network
- high precision
- support vector machine (svm)
- fuzzy logic
- significant improvement
- prior knowledge
- preprocessing
- monte carlo
- human head
- real time
- parametric models
- learning algorithm
- monte carlo simulation
- image segmentation
- eye tracking
- feature extraction
- dynamic programming
- statistical model
- segmentation method
- similarity measure
- clustering method
- cost function