Semi-structured document extraction based on document element block model.
Tao Lv, Jiang Liu, Fan Lu, Peng Zhang, Xinyan Wang, Cong Wang
Published in: CCIS (2016)
Keyphrases
- semi structured
- web documents
- structured data
- information extraction
- information retrieval systems
- document collections
- document images
- relevant documents
- semi structured documents
- information retrieval
- web data extraction
- content and structure
- probabilistic model
- keywords
- retrieval systems
- data extraction
- semi structured data
- html pages
- data model