TitleFinder: extracting the headline of news web pages based on cosine similarity and overlap scoring similarity.
Hadi Mohammadzadeh, Thomas Gottron, Franz Schweiggert, Gerhard Heyer
Published in: WIDM (2012)
Keyphrases
- cosine similarity
- web pages
- similarity measure
- similarity function
- distance measure
- document similarity
- similarity computation
- keywords
- Euclidean distance
- vector space
- tf-idf
- vector space model
- document clustering
- semantic similarity
- k-means
- web documents
- search engine
- web search
- contextual information
- distance function
- web search engines
- pairwise