Finding Web Document Associations Using Frequent Pairs of Adjacent Words.
Jason Yong-Jin Tee, Lay-Ki Soon, Bali Ranaivo-Malançon
Published in: KTW (2011)
Keyphrases
- web documents
- semistructured documents
- document representation
- keywords
- n-gram
- prefetching
- semi-structured
- web pages
- tree structured patterns
- web search engines
- web content
- information extraction
- related words
- web data
- word pairs
- textual information
- pairwise
- databases
- focused crawling
- response time
- co-occurrence
- website
- web logs
- xml documents
- dynamically generated
- document collections
- visual features