BRAVEn: Improving Self-supervised pre-training for Visual and Auditory Speech Recognition.
Alexandros Haliassos, Andreas Zinonos, Rodrigo Mira, Stavros Petridis, Maja Pantic
Published in: ICASSP (2024)
Keyphrases
- speech recognition
- wall street journal corpus
- visual information
- isolated word
- hidden markov models
- acoustic models
- pattern recognition
- language model
- speech recognizer
- automatic speech recognition
- speech processing
- speaker identification
- visual speech
- visual features
- speech signal
- speech synthesis
- keyword spotting
- speech recognition technology
- signal processing
- speech understanding
- training process
- speech recognizers
- speech recognition systems
- speaker independent
- noisy environments
- visual data
- speaker dependent
- computer vision