Content extraction from news web pages using tag tree.

Chandrakala Arya, Sanjay K. Dwivedi

Published in: Int. J. Auton. Comput. (2018)

Keyphrases

content extraction
DOM tree
HTML documents
web news
web pages
text content
keywords
web documents
news pages
automatic extraction
search engine
web content
structured documents
website
web search
semi structured
semantic information
textual content
social annotations
news articles
semistructured data
XML documents
link structure
digital archives
information extraction
database systems
databases