Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention.
Katsuyuki Nakamura, Hiroki Ohashi, Mitsuhiro Okada
Published in: CoRR (2021)
Keyphrases
- real time
- visual saliency
- dynamic environments
- video data
- multimedia
- video sequences
- modal logic
- video frames
- real time video
- video segmentation
- video content
- sensor data
- activity recognition
- video streams
- temporal information
- event detection
- visual features
- video database
- sensor networks
- dynamic textures
- neural network
- data sets