Speaker diarisation using 2D self-attentive combination of embeddings.

Guangzhi Sun, Chao Zhang, Philip C. Woodland

Published in: CoRR (2019)

Keyphrases

real time
data sets
neural network
dimensionality reduction
low dimensional
vector space
speaker recognition
database
artificial intelligence
search engine
computer vision
image sequences
visual features
speech recognition
speaker verification