WebKE: Knowledge Extraction from Semi-structured Web with Pre-trained Markup Language Model.
Chenhao Xie, Wenhao Huang, Jiaqing Liang, Chengsong Huang, Yanghua Xiao
Published in: CIKM (2021)
Keyphrases
- semi-structured
- knowledge extraction
- language model
- web documents
- pre-trained
- web data
- semi-structured data
- content and structure
- language modeling
- structured data
- information extraction
- document retrieval
- probabilistic model
- data model
- textual documents
- knowledge discovery
- data mining
- information retrieval
- speech recognition
- retrieval model
- query expansion
- test collection
- unstructured data
- web pages
- text mining
- free text
- relevance model
- training data
- databases
- textual data
- pseudo-relevance feedback
- web mining
- html documents
- database
- smoothing methods
- multimedia
- query terms
- bayesian networks
- document ranking
- training examples