Aligning Words from Speech Recognition and Shots for Video Information Retrieval.
Yu-Jen Cheng, Hsin-Hsi Chen
Published in: TRECVID (2004)
Keyphrases
- speech recognition
- information retrieval
- video data
- language model
- speech recognition systems
- key frames
- video sequences
- video streams
- n-gram
- video shots
- video clips
- speech processing
- speech recognition errors
- hidden Markov models
- video content
- language modeling
- speech synthesis
- news video
- spoken document retrieval
- speech recognizer
- video retrieval
- video frames
- video database
- speech signal
- pattern recognition
- video analysis
- automatic speech recognition
- speech recognition technology
- visual features
- information retrieval systems
- speaker identification
- keywords
- multimedia
- handwriting recognition
- digital video library
- visual content
- test collection
- text mining
- word segmentation
- document collections
- speech retrieval
- image processing
- visual data
- multimedia data
- text documents
- speaker independent
- speaker adaptation
- probabilistic model