Content extraction from news web pages using tag tree.
Chandrakala Arya, Sanjay K. Dwivedi
Published in: Int. J. Auton. Comput. (2018)
Keyphrases
- content extraction
- dom tree
- html documents
- web news
- web pages
- text content
- keywords
- web documents
- news pages
- automatic extraction
- search engine
- web content
- structured documents
- website
- web search
- semi structured
- semantic information
- textual content
- social annotations
- news articles
- semistructured data
- xml documents
- link structure
- digital archives
- information extraction
- database systems
- databases