Automatic Semantic Subject Indexing of Web Documents in Highly Inflected Languages.
Reetta Sinkkilä, Osma Suominen, Eero Hyvönen
Published in: ESWC (1) (2011)
Keyphrases
- web documents
- information extraction
- semi-structured
- web pages
- keywords
- web search engines
- link structure
- web content
- information retrieval
- textual information
- document representation
- database
- vector space model
- cross-lingual
- focused crawling
- n-gram
- indexing method
- text retrieval
- web data
- relational databases
- knowledge discovery
- similarity measure
- search engine
- structured data
- structured documents
- access methods
- html documents
- text mining