Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification.
Jingyu Li, Yusheng Tian, Tan Lee
Published in: ICASSP (2023)
Keyphrases
- speaker verification
- speaker recognition
- language identification
- noisy environments
- prosodic features
- audio visual
- emotion recognition
- text mining
- information retrieval
- image processing
- text to speech
- multi modal
- semantic information
- multiscale
- multimedia
- neural network
- multilayer perceptron
- feature extraction
- high level
- computer vision