Modeling Text-visual Mutual Dependency for Multi-modal Dialog Generation.
Shuhe Wang, Yuxian Meng, Xiaofei Sun, Fei Wu, Rongbin Ouyang, Rui Yan, Tianwei Zhang, Jiwei Li
Published in: CoRR (2021)
Keyphrases
- multi modal
- video search
- mutual dependency
- cross modal
- multiple modalities
- multi modality
- auto annotation
- single modality
- information retrieval
- audio visual
- image processing
- visual information
- high dimensional
- image annotation
- semantic concepts
- text mining
- image analysis
- uni modal
- high level
- text retrieval
- image registration
- low level
- natural language