StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond.
Pengyuan Lyu, Yulin Li, Hao Zhou, Weihong Ma, Xingyu Wan, Qunyi Xie, Liang Wu, Chengquan Zhang, Kun Yao, Errui Ding, Jingdong Wang
Published in: CoRR (2024)
Keyphrases
- language model
- information retrieval
- language modeling
- n-gram
- document retrieval
- image segmentation
- image retrieval
- language modelling
- image classification
- image content
- query expansion
- retrieval model
- text retrieval
- speech recognition
- context sensitive
- image representation
- ad hoc information retrieval
- language model for information retrieval
- prior information
- query terms
- test collection
- probabilistic model
- Bayesian networks
- mixture model
- web search
- relevance model
- smoothing methods
- text mining