Lightweight recurrent cross-modal encoder for video question answering.
Steve Andreas Immanuel, Cheol Jeong
Published in: Knowl. Based Syst. (2023)
Keyphrases
- lightweight
- question answering
- cross modal
- multi modal
- visual data
- natural language processing
- video sequences
- natural language
- video data
- multimedia
- semantic concepts
- information retrieval
- information extraction
- video frames
- multimedia databases
- video content
- video streams
- video analysis
- wireless sensor networks
- image retrieval
- image data
- low level
- knowledge base
- multimedia data
- human actions
- video retrieval
- digital libraries