Rep-MCA-former: An efficient multi-scale convolution attention encoder for text-independent speaker verification.
Xiaohu Liu, Defu Chen, Xianbao Wang, Sheng Xiang, Xuwen Zhou
Published in: Comput. Speech Lang. (2024)
Keyphrases
- speaker verification
- multiscale
- noisy environments
- speaker recognition
- information retrieval
- language identification
- text mining
- prosodic features
- image processing
- edge detection
- bit rate
- keywords
- emotion recognition
- multimodal
- artificial neural networks
- audio-visual
- text data
- language model
- high dimensional
- image segmentation
- computer vision