A continuous vocoder for statistical parametric speech synthesis and its evaluation using an audio-visual phonetically annotated Arabic corpus.
Mohammed Salah Al-Radhi, Omnia Abdo, Tamás Gábor Csapó, Sherif M. Abdou, Géza Németh, Mervat Fashal
Published in: Comput. Speech Lang. (2020)
Keyphrases
- audio visual
- speech synthesis
- multi modal
- text to speech
- visual information
- speech recognition
- speech corpus
- audio visual speech recognition
- multi stream
- visual data
- neural network
- emotion recognition
- speaker verification
- multimedia
- vocal tract
- person authentication
- nearest neighbor
- spatio temporal
- high dimensional
- image processing
- information retrieval
- machine learning