Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data.
Luis Sanchez Tapia, Antonio Gomez, Mario Esparza, Venkatesh Jatla, Marios Pattichis, Sylvia Celedón-Pattichis, Carlos LopezLeiva
Published in: CoRR (2021)
Keyphrases
- speech recognition
- video data
- automatic speech recognition
- video analysis
- video streams
- video sequences
- hidden Markov models
- speaker dependent
- video retrieval
- language model
- multimedia
- pattern recognition
- speech recognizer
- speech synthesis
- video database
- speech signal
- speaker identification
- video frames
- three dimensional
- video content
- noisy environments
- machine translation
- video clips
- cross lingual
- speaker independent
- speaker diarization
- speech recognition systems
- acoustic models
- multimedia systems
- speaker recognition
- key frames
- speaker adaptation
- cross language information retrieval
- news video