SgLFT: Semantic-guided Late Fusion Transformer for video corpus moment retrieval.
Tongbao Chen, Wenmin Wang, Minglu Zhao, Ruochen Li, Zhe Jiang, Cheng Yu
Published in: Neurocomputing (2024)
Keyphrases
- late fusion
- cross media
- video indexing
- concept detection
- semantic concepts
- multimedia
- visual features
- multimedia information
- video data
- video retrieval
- image retrieval
- video analysis
- semantic content
- video shots
- image annotation
- low level features
- action classification
- semantic similarity
- video content
- multi modal
- key frames
- visual concepts
- video sequences
- multimedia documents
- semantic information
- video database
- multimedia data
- video segments
- multimedia content
- video surveillance
- video streams
- visual content
- cross language
- news video
- manually annotated
- action recognition
- information retrieval
- image database
- event detection
- video frames
- bag of words
- high level
- keywords
- human actions
- natural language
- digital content
- content based retrieval
- image collections
- text retrieval
- multi label
- relevance feedback