Deep Music Retrieval for Fine-Grained Videos by Exploiting Cross-Modal-Encoded Voice-Overs.
Tingtian Li, Zixun Sun, Haoruo Zhang, Jin Li, Ziming Wu, Hui Zhan, Yipeng Yu, Hengcan Shi
Published in: CoRR (2021)
Keyphrases
- fine grained
- cross modal
- music retrieval
- multi modal
- music information retrieval
- audio features
- visual data
- access control
- video sequences
- image retrieval
- semantic features
- video content
- audio visual
- multimedia databases
- video analysis
- information retrieval
- video frames
- visual information
- visual features
- search engine