Selective Listening by Synchronizing Speech With Lips.
Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2022)
Keyphrases
- synthesized speech
- speech recognition
- visual speech
- speech signal
- audio visual
- endpoint detection
- face recognition
- text to speech
- noisy environments
- recognition engine
- spontaneous speech
- speech synthesis
- audio signals
- speaker verification
- spoken language
- data sets
- human faces
- non stationary
- hidden markov models
- computer vision
- neural network