Bimodal automatic speech segmentation based on audio and visual information fusion.
Eren Akdemir, Tolga Çiloglu
Published in: Speech Commun. (2011)
Keyphrases
- information fusion
- emotion recognition
- speech music discrimination
- visual information
- audio visual
- data fusion
- audio stream
- fusion algorithm
- visual data
- information gathering
- fusion method
- video indexing and retrieval
- visual speech
- soft computing
- multi source
- low level
- audio features
- automatic transcription
- visual features
- content based video retrieval
- speaker identification
- text to speech
- speech recognition
- fusion model
- machine learning
- decision level
- audio signals
- broadcast news
- image segmentation
- multiscale
- speaker diarization
- real time
- artificial intelligence
- dempster shafer evidence theory
- multi sensor
- fault diagnosis