BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition.
Alexandros Haliassos, Andreas Zinonos, Rodrigo Mira, Stavros Petridis, Maja Pantic
Published in: CoRR (2024)
Keyphrases
- speech recognition
- visual information
- wall street journal corpus
- isolated word
- hidden markov models
- speech synthesis
- acoustic models
- language model
- automatic speech recognition
- speech processing
- pattern recognition
- speaker independent
- speech signal
- speech recognition systems
- speech recognizer
- speaker identification
- speech recognition technology
- visual features
- speech understanding
- signal processing
- noisy environments
- speaker dependent
- visual speech
- keyword spotting
- machine learning
- training process
- natural language processing
- feature vectors
- speech recognizers
- discriminative training
- image classification
- training set
- image processing