Align vision-language semantics by multi-task learning for multi-modal summarization.
Chenhao Cui, Xinnian Liang, Shuangzhi Wu, Zhoujun Li
Published in: Neural Comput. Appl. (2024)
Keyphrases
- multi modal
- multi task learning
- multi task
- learning tasks
- gaussian processes
- learning problems
- multitask learning
- multiple tasks
- computer vision
- multi modality
- transfer learning
- high order
- theoretical analysis
- audio visual
- natural language
- high dimensional
- image annotation
- semantic information
- inductive learning
- gaussian process
- uni modal
- loss function
- logic programming
- pairwise