Face, Body, Voice: Video Person-Clustering with Multiple Modalities.
Andrew Brown, Vicky Kalogeiton, Andrew Zisserman
Published in: CoRR (2021)
Keyphrases
- multiple modalities
- multi modal
- multimedia
- imaging modalities
- multimedia data
- video search
- video content
- low level features
- human body
- video data
- image processing
- cross modal
- space time
- video frames
- video streams
- facial expressions
- video shots
- visual cues
- video database
- high dimensional data
- video sequences
- video analysis
- multimedia databases
- key frames
- gray level
- decision support system
- high resolution