A Large Dataset of Historical Japanese Documents with Complex Layouts.
Zejiang Shen, Kaixuan Zhang, Melissa Dell
Published in: CVPR Workshops (2020)
Keyphrases
- information retrieval systems
- database
- information retrieval
- document collections
- metadata
- text documents
- benchmark datasets
- resource intensive
- historical documents
- historical data
- query terms
- complex systems
- semantic information
- information extraction
- XML documents
- high level
- web documents
- synthetic datasets
- keywords
- Japanese language