Multi-modal Feature Fusion Based on Variational Autoencoder for Visual Question Answering.
Liqing Chen, Yifan Zhuo, Yingjie Wu, Yilei Wang, Xianghan Zheng
Published in: PRCV (2) (2019)
Keyphrases
- multi modal
- question answering
- feature fusion
- video search
- feature extraction
- natural language processing
- information retrieval
- information extraction
- natural language
- multiple features
- visual information
- image segmentation
- visual features
- audio visual
- high dimensional
- artificial intelligence
- image annotation
- feature selection
- multiscale
- action recognition
- knowledge discovery
- low level
- learning algorithm
- machine learning