UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Lin, Fei Huang
Published in: EMNLP (Findings) (2023)
Keyphrases
- language model
- language understanding
- language modeling
- natural language understanding
- n-gram
- probabilistic model
- document retrieval
- semantic interpretation
- language processing
- optical character recognition
- retrieval model
- speech recognition
- information retrieval
- dialogue system
- spoken dialogue systems
- test collection
- context sensitive
- query expansion
- natural language
- character recognition
- query terms
- mixture model
- document images
- smoothing methods
- ad hoc information retrieval
- vector space model
- natural language processing
- machine learning
- intelligent systems
- cognitive psychology
- knn
- translation model
- general knowledge
- active learning
- user interface
- word clouds