Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation.
Xun Wu, Shaohan Huang, Furu Wei
Published in: CoRR (2024)
Keyphrases
- language model
- image generation
- language modeling
- information retrieval
- document retrieval
- probabilistic model
- n gram
- query expansion
- speech recognition
- retrieval model
- test collection
- high resolution
- context sensitive
- mixture model
- text retrieval
- smoothing methods
- multiword
- digital imaging
- translation model
- ad hoc information retrieval
- query terms
- text documents
- text mining
- computer vision
- generative model
- machine learning