Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation.
Marco Gaido, Matteo Negri, Mauro Cettolo, Marco Turchi
Published in: CoRR (2021)
Keyphrases
- voice activity detection
- noisy environments
- speech recognition
- segmentation algorithm
- segmentation method
- level set
- image segmentation
- medical images
- multiscale
- edge detection
- audio stream
- speaker identification
- machine translation
- neural network
- noise reduction
- audio visual
- speaker verification
- fully unsupervised
- bayesian networks
- speech processing
- text to speech
- word segmentation
- emotion recognition
- graph cuts
- speech signal
- visual data
- multi modal