DIVE: End-to-end Speech Diarization via Iterative Speaker Embedding.
Neil Zeghidour, Olivier Teboul, David Grangier
Published in: CoRR (2021)
Keyphrases
- end to end
- speaker diarization
- speaker identification
- speech recognition
- speaker recognition
- broadcast news
- automatic speech recognition
- audio visual
- speaker verification
- speech signal
- gaussian mixture model
- speaker dependent
- noisy environments
- wireless ad hoc networks
- admission control
- multipath
- high bandwidth
- ad hoc networks
- internet protocol
- congestion control
- prosodic features
- vocal tract
- rate allocation
- motion estimation
- speaker adaptation
- real world
- synthesized speech
- transport layer
- word error rate
- speech synthesis
- content delivery
- web services