Multi-modal Auto-regressive Modeling via Visual Words.
Tianshuo Peng, Zuchao Li, Lefei Zhang, Hai Zhao, Ping Wang, Bo Du
Published in: CoRR (2024)
Keyphrases
- multi-modal
- visual words
- autoregressive
- bag of words
- moving average
- image representation
- image classification
- co-occurrence
- image features
- visual content
- video search
- image retrieval
- scene classification
- keypoints
- image annotation
- spatial information
- semantic concepts
- text classification
- image processing
- high-dimensional
- non-stationary
- pairwise