Multi-Modal Temporal Convolutional Network for Anticipating Actions in Egocentric Videos.
Olga Zatsarynna, Yazan Abu Farha, Juergen Gall. Published in: CVPR Workshops (2021)
Keyphrases
- multi-modal
- convolutional network
- video search
- human actions
- convolutional neural networks
- human activities
- semantic concepts
- atomic actions
- video sequences
- multi-modality
- activity recognition
- audio-visual
- image annotation
- action recognition
- video frames
- video content
- high-dimensional
- cross-modal
- uni-modal
- feature extraction