Audio-visual speech recognition using deep bottleneck features and high-performance lipreading.
Satoshi Tamura, Hiroshi Ninomiya, Norihide Kitaoka, Shin Osuga, Yurie Iribe, Kazuya Takeda, Satoru Hayamizu
Published in: APSIPA (2015)
Keyphrases
- audio visual
- person authentication
- multi modal
- audio features
- multi stream
- visual information
- multimodal fusion
- emotion recognition
- visual data
- multimedia
- speaker verification
- audio visual speech recognition
- co occurrence
- feature set
- feature extraction
- human computer interaction
- data management
- data analysis
- metadata
- data sets