Multimodal attention-based transformer for video captioning.
Hemalatha Munusamy, C. Chandra Sekhar
Published in: Appl. Intell. (2023)
Keyphrases
- multimedia
- video streams
- video clips
- real time
- video sequences
- video content
- video data
- multi modal
- fuzzy logic
- video frames
- video images
- real time video
- multimodal information
- key frames
- video database
- video analysis
- digital video
- video processing
- neural network
- video surveillance
- fault diagnosis
- story segmentation
- video retrieval
- dynamic scenes
- visual data
- feature vectors
- spatio temporal
- computer vision
- artificial intelligence
- visual saliency
- genetic algorithm