Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization.
Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Takuya Yoshioka, Jian Wu
Published in: CoRR (2022)
Keyphrases
- end to end
- voice activity detection
- speaker diarization
- speech recognition
- noisy environments
- admission control
- speaker identification
- network architecture
- multipath
- wireless ad hoc networks
- high bandwidth
- neural network
- speaker verification
- transport layer
- content delivery
- rate allocation
- internet protocol
- packet loss rate
- automatic speech recognition