Hybrid Indexing and Seamless Ranking of Spatial and Textual Features of Web Documents.
Ali Khodaei, Cyrus Shahabi, Chen Li
Published in: DEXA (1) (2010)
Keyphrases
- web documents
- textual features
- web pages
- bag of words
- web search engines
- information extraction
- keywords
- web content
- spatial data
- semi-structured
- web search
- visual features
- information retrieval
- link structure
- search engine
- content similarity
- html documents
- semantic association
- web mining
- active learning
- document representation
- data analysis
- object recognition
- unstructured documents