TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency.
Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid
Published in: CoRR (2022)
Keyphrases
- image content
- cross modal
- instructional videos
- visual content
- image retrieval
- visual similarity
- visual data
- relevance feedback
- multi modal
- semantic concepts
- visual features
- data warehouse
- multimedia retrieval
- visual recognition
- information retrieval
- key frames
- multimedia databases
- content analysis
- test collection
- feature space
- video sequences
- computer vision