Content Extraction from News Pages Using Particle Swarm Optimization on Linguistic and Structural Features.
Cai-Nicolas Ziegler, Michal Skubacz
Published in: Web Intelligence (2007)
Keyphrases
- structural features
- content extraction
- news pages
- web news
- digital archives
- structural information
- html documents
- multimedia information retrieval
- text content
- semantic features
- natural language processing
- natural language
- databases
- secondary structure
- feature set
- textual content
- digital libraries
- context aware
- domain knowledge
- multimedia
- data mining