NBID Dataset: Towards Robust Information Extraction in Official Documents.
Lucas Wojcik, Luiz Coelho, Roger Granada, Gustavo Führ, David Menotti
Published in: SIBGRAPI (2023)
Keyphrases
- information extraction
- free text
- information retrieval
- web documents
- text documents
- text mining
- unstructured documents
- precision and recall
- textual data
- unstructured text
- text analysis
- database
- xml documents
- information retrieval systems
- natural language processing
- named entity recognition
- legal documents
- semi-structured
- text processing
- document classification
- data mining
- document retrieval
- question answering
- metadata
- document clustering
- retrieval systems
- user queries
- text data
- wordnet
- relational databases
- natural language text
- digital libraries
- machine learning