TabLeX: A Benchmark Dataset for Structure and Content Information Extraction from Scientific Tables.
Harsh Desai, Pratik Kayal, Mayank Singh
Published in: ICDAR (2) (2021)
Keyphrases
- benchmark datasets
- information extraction
- natural language processing
- data mining
- text mining
- content and structure
- database
- web documents
- semantic structure
- probabilistic model
- precision and recall
- machine learning
- knowledge discovery
- multimedia
- conditional random fields
- named entities
- multimedia data
- free text
- natural language text