TabLeX: A Benchmark Dataset for Structure and Content Information Extraction from Scientific Tables.
Harsh Desai, Pratik Kayal, Mayank Singh
Published in: CoRR (2021)
Keyphrases
- benchmark datasets
- information extraction
- natural language processing
- artificial intelligence
- multimedia
- logical structure
- machine learning
- data mining
- conditional random fields
- pedestrian detection
- semantic structure
- database
- metadata
- co-occurrence
- web documents
- tree structure
- text documents
- structural information
- web content
- graph structure