SIMARA: a database for key-value information extraction from full pages.
Solène Tarride, Mélodie Boillet, Jean-François Moufflet, Christopher Kermorvant
Published in: CoRR (2023)
Keyphrases
- database
- information extraction
- databases
- web documents
- relational databases
- database applications
- data management
- website
- machine learning
- web pages
- search engine
- neural network
- conditional random fields
- named entities
- relational learning
- named entity recognition
- web server
- web mining
- web databases
- text mining
- query language
- data model
- artificial intelligence
- data sets