Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts.
Tomasz Stanislawek, Filip Gralinski, Anna Wróblewska, Dawid Lipinski, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski, Przemyslaw Biecek
Published in: CoRR (2021)
Keyphrases
- information extraction
- text documents
- free text
- web documents
- information retrieval
- database
- unstructured documents
- text mining
- textual data
- document classification
- real world
- natural language text
- natural language processing
- information retrieval systems
- document collections
- text analysis
- text collections
- high level
- precision and recall
- metadata
- text data
- resource intensive
- question answering
- unstructured text
- information extraction systems
- web mining
- structured data
- xml documents
- image retrieval
- natural language
- keywords
- data sets