Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation.
Wenliang Dai, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung
Published in: CoRR (2022)
Keyphrases
- domain knowledge
- computer vision
- prior knowledge
- language learning
- real time
- knowledge acquisition
- information systems
- multimedia
- representation language
- knowledge management
- background knowledge
- specification language
- audio visual
- knowledge sources
- conceptual model
- knowledge sharing
- higher level
- multi modal
- programming language
- knowledge discovery
- knowledge representation
- expert systems
- natural language
- data sets