Context-dependent audio-visual and temporal features fusion for TV commercial detection.
Bo Zhang, Jiancheng Zou, Bo Xu
Published in: ISCAS (2013)
Keyphrases
- context dependent
- audio visual
- TV broadcast
- multi modal
- person authentication
- multimodal fusion
- visual information
- natural language
- temporal context
- visual data
- low level
- multi stream
- audio visual speech recognition
- multimedia
- visual features
- action recognition
- co occurrence
- face recognition
- machine learning
- data sets