NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification.
Hyunjun Heo, Ui-Hyeop Shin, Ran Lee, Youngju Cheon, Hyung-Min Park
Published in: CoRR (2023)
Keyphrases
- speaker verification
- multiscale
- noisy environments
- speaker recognition
- prosodic features
- spatio-temporal
- image processing
- emotion recognition
- audio-visual
- image representation
- edge detection
- multilayer perceptron
- image segmentation
- image sequences
- information retrieval
- probabilistic model
- face recognition
- language identification