Cross-domain Multi-modal Few-shot Object Detection via Rich Text.
Zeyu Shangguan, Daniel Seita, Mohammad Rostami
Published in: CoRR (2024)
Keyphrases
- multi modal
- cross domain
- object detection
- video search
- sentiment classification
- multiple modalities
- multi modality
- domain adaptation
- transfer learning
- knowledge transfer
- information retrieval
- cross modal
- computer vision
- audio visual
- target domain
- text categorization
- text mining
- text documents
- video content
- high dimensional
- video sequences
- keywords
- uni modal
- humanoid robot
- image annotation
- natural language processing
- low level
- object recognition
- high level
- learning algorithm