End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection.
Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola García, Kenji Nagamatsu
Published in: CoRR (2021)
Keyphrases
- end to end
- speaker diarization
- speech recognition
- audio stream
- text localization and recognition
- broadcast news
- congestion control
- admission control
- application layer
- speaker identification
- speech activity detection
- feature space
- handwriting recognition
- real world
- video search
- speech signal
- neural network model
- computer vision
- machine learning