ASR is All You Need: Cross-Modal Distillation for Lip Reading.
Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman
Published in: ICASSP (2020)
Keyphrases
- cross modal
- lip reading
- automatic speech recognition
- multi modal
- speaker identification
- head tracking
- speech recognition
- noisy environments
- expression recognition
- multimedia retrieval
- speech signal
- image retrieval
- multimedia databases
- visual similarity
- visual data
- hidden Markov models
- computer vision
- face recognition
- image sequences