UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang
Published in: CoRR (2023)
Keyphrases
- language model
- language understanding
- language modeling
- natural language understanding
- n-gram
- document retrieval
- language processing
- probabilistic model
- optical character recognition
- information retrieval
- test collection
- query expansion
- spoken dialogue systems
- ad hoc information retrieval
- semantic interpretation
- retrieval model
- speech recognition
- character recognition
- context sensitive
- cognitive psychology
- document images
- natural language
- dialogue system
- vector space model
- general knowledge
- smoothing methods
- multimedia
- tf-idf
- recommender systems