An Improvement in Audio-Visual Voice Activity Detection for Automatic Speech Recognition.
Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno
Published in: IEA/AIE (1) (2010)
Keyphrases
- audio visual
- automatic speech recognition
- voice activity detection
- noisy environments
- speech recognition
- audio visual speech recognition
- multi modal
- speaker verification
- speech signal
- visual information
- hidden markov models
- visual data
- broadcast news
- conversational speech
- multimedia
- multi stream
- emotion recognition
- speaker identification
- audio features
- language model
- image processing
- passage retrieval
- contextual information
- acoustic features
- pattern recognition
- information retrieval