Employing Clustering Techniques for Automatic Information Extraction From HTML Documents.
Fatima Ashraf, Tansel Özyer, Reda Alhajj
Published in: IEEE Trans. Syst. Man Cybern. Part C (2008)
Keyphrases
- html documents
- information extraction
- web documents
- semi structured
- web information extraction
- clustering method
- clustering algorithm
- web page retrieval
- information retrieval
- web pages
- natural language processing
- semantic information
- semi automatic
- text mining
- web content
- data extraction
- automatic extraction
- structured documents
- structured data
- document clustering
- xml documents
- vector space model
- data mining
- data analysis
- machine learning