Generalizing RNN-Transducer to Out-Domain Audio via Sparse Self-Attention Layers.
Juntae Kim, Jeehye Lee
Published in: INTERSPEECH (2022)
Keyphrases
- recurrent neural networks
- domain specific
- nearest neighbor
- high dimensional
- information retrieval
- multimedia
- digital video
- domain independent
- cross domain
- audio video
- neural network
- cross modal
- focus of attention
- sparse data
- multi layer
- feed forward
- visual attention
- domain ontology
- domain experts
- back propagation
- genetic algorithm
- machine learning