Improving speaker identification in TV-shows using person name detection in overlaid text and speech.
Delphine Charlet, Corinne Fredouille, Géraldine Damnati, Grégory Senay
Published in: INTERSPEECH (2013)
Keyphrases
- speaker identification
- noisy environments
- speech recognition
- speech signal
- tv shows
- gaussian mixture model
- closed captions
- broadcast news
- topic segmentation
- feature extraction
- video clips
- human interactions
- automatic speech recognition
- text mining
- noise reduction
- pattern recognition
- mixture model
- information retrieval
- computer vision
- video streams
- video search
- human interaction
- non stationary
- video data
- visual features
- video sequences
- keywords
- search engine