TabLeX: A Benchmark Dataset for Structure and Content Information Extraction from Scientific Tables.

Harsh Desai, Pratik Kayal, Mayank Singh

Published in: ICDAR (2) (2021)

Keyphrases

benchmark datasets
information extraction
natural language processing
data mining
text mining
content and structure
database
web documents
semantic structure
probabilistic model
precision and recall
machine learning
knowledge discovery
multimedia
conditional random fields
named entities
multimedia data
free text
natural language text