Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data.
Luis Sanchez Tapia, Antonio Gomez, Mario Esparza, Venkatesh Jatla, Marios Pattichis, Sylvia Celedón-Pattichis, Carlos López Leiva
Published in: CAIP (1) (2021)
Keyphrases
- speech recognition
- video data
- automatic speech recognition
- video sequences
- video streams
- video analysis
- language model
- video frames
- speaker dependent
- speech recognizer
- speaker identification
- speech signal
- hidden markov models
- video content
- multimedia
- pattern recognition
- video database
- noisy environments
- video retrieval
- speech recognition systems
- speech synthesis
- speaker independent
- cross lingual
- video clips
- three dimensional
- multimedia systems
- visual data
- key frames
- speaker diarization
- speaker adaptation
- speech retrieval
- parallel corpora
- machine translation
- speaker recognition
- acoustic models
- feature vectors
- information retrieval
- neural network