Keyphrases
- web documents
- web data
- multilingual documents
- website
- web information
- information retrieval
- web mining
- web applications
- document repositories
- digital documents
- content similarity
- document collections
- text information
- database
- topic specific
- web content
- keywords
- document retrieval
- web pages
- information sources
- information retrieval systems
- newspaper articles
- electronic documents
- textual data
- web resources
- document classification
- web users
- Google Scholar
- retrieval systems
- digital libraries
- structured information
- web crawler
- meta information
- multimedia documents
- web queries
- linked data
- text categorization
- information extraction
- XML documents
- metadata
- Extensible Markup Language
- current web search engines
- focused crawling
- web directories
- search interface
- vector space model
- user interests
- text documents
- user behavior