DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding.
Zilong Wang, Mingjie Zhan, Xuebo Liu, Ding Liang
Published in: EMNLP (Findings) (2020)
Keyphrases
- pairwise
- classification method
- tree structure
- cost function
- clustering method
- significant improvement
- computational complexity
- multimedia
- preprocessing
- data sets
- high accuracy
- synthetic data
- evolutionary algorithm
- keywords
- experimental evaluation
- classification accuracy
- support vector machine
- multiscale
- web documents
- detection method
- neural network