EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts.
Yucheng Han, Rui Wang, Chi Zhang, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang
Published in: CoRR (2024)
Keyphrases
- multi modal
- auto annotation
- diffusion model
- multiple modalities
- image data
- uni modal
- multi modality
- video search
- single modality
- edge detection
- image retrieval
- image collections
- image representation
- image content
- image classification
- image segmentation
- low level
- image analysis
- multiscale
- image processing
- image annotation
- text mining
- web images
- cross modal
- high resolution
- motion field
- diffusion process
- object recognition
- image sequences