LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding.
Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei A. F. Florêncio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou
Published in: ACL/IJCNLP (1) (2021)
Keyphrases
- multi modal
- document understanding
- designing effective
- automatic text summarization
- cross modal
- training set
- automatic summarization
- multi modality
- document clustering
- multi document summarization
- image annotation
- knowledge discovery
- recommender systems
- high dimensional
- information filtering
- image search
- information retrieval
- feature space
- high level