Research on Content Extraction of Rich Text Web Pages.
Hangfeng Yang, Hui Lu, Shudong Li, Mohan Li, Yanbin Sun
Published in: ICAIS (4) (2019)
Keyphrases
- content extraction
- text content
- web pages
- html documents
- web news
- website
- search engine
- keywords
- web documents
- web search engines
- web search
- textual content
- digital archives
- web content
- multimedia information retrieval
- topic modeling
- link structure
- information retrieval systems
- topic models
- automatic extraction
- web graph
- text mining
- machine learning