ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection.
Thinh Phan, Khoa Vo, Duy Le, Gianfranco Doretto, Donald A. Adjeroh, Ngan Le
Published in: WACV (2024)
Keyphrases
- end to end
- language model
- action detection
- language modeling
- n gram
- probabilistic model
- computer vision
- action recognition
- temporal information
- information retrieval
- retrieval model
- test collection
- context sensitive
- query expansion
- atomic actions
- temporal reasoning
- space time
- spatio temporal
- mixture model
- object detection
- action classification
- pattern search
- graphical models
- human actions
- object recognition
- temporal relations