Multi-Modal Temporal Convolutional Network for Anticipating Actions in Egocentric Videos
Olga Zatsarynna, Yazan Abu Farha, Juergen Gall. Published in: CoRR (2021)
Keyphrases
- multi-modal
- human activities
- convolutional network
- video search
- activity recognition
- human actions
- convolutional neural networks
- atomic actions
- multi-modality
- video sequences
- semantic concepts
- audio-visual
- image annotation
- uni-modal
- computer vision
- video frames
- action recognition
- coarse-to-fine
- medical images
- high-dimensional
- image segmentation