Audio-Visual Keyword Spotting Based on Multidimensional Convolutional Neural Network.
Runwei Ding, Cheng Pang, Hong Liu
Published in: ICIP (2018)
Keyphrases
- audio visual
- convolutional neural network
- keyword spotting
- multi modal
- face detection
- speech recognition
- hidden markov models
- visual information
- visual data
- speech processing
- printed documents
- neural network
- multimedia
- handwritten documents
- object detection
- pattern recognition
- multi stream
- text mining
- video data
- high dimensional data
- machine learning