HRDoc: Dataset and Baseline Method Toward Hierarchical Reconstruction of Document Structures.
Jiefeng Ma, Jun Du, Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Huihui Zhu, Cong Liu
Published in: CoRR (2023)
Keyphrases
- high precision
- synthetic data
- neural network
- computational complexity
- computational cost
- feature set
- segmentation method
- surface model
- similarity measure
- objective function
- cost function
- dynamic programming
- information retrieval systems
- reconstruction process
- hierarchical model
- document clustering
- text documents
- detection method
- classification accuracy
- feature extraction
- three dimensional