A Transformer-based Late-Fusion Mechanism for Fine-Grained Object Recognition in Videos.
Jannik Koch, Stefan Wolf, Jürgen Beyerer
Published in: WACV (Workshops) (2023)
Keyphrases
- fine grained
- late fusion
- object recognition
- video indexing
- coarse grained
- video data
- access control
- visual features
- cross media
- computer vision
- image understanding
- video analysis
- video retrieval
- video frames
- video sequences
- image retrieval
- video database
- action classification
- video surveillance
- image classification
- concept detection
- bag of words
- video segments
- feature space