Developing High-resolution Universal Multi-type n-gram Text Similarity Detector.
Yurii Palkovskii, Alexei Belov
Published in: CLEF (Working Notes) (2014)
Keyphrases
- n-gram
- high resolution
- character n-grams
- language model
- text classification
- language independent
- web documents
- low resolution
- variable length
- bag of words
- word level
- language modeling
- distance measure
- language specific
- text mining
- language modelling
- information retrieval
- super resolution
- part of speech
- text documents
- text retrieval
- keywords
- semantic similarity
- viterbi algorithm
- statistical language modeling
- cross lingual
- word pairs
- neural network
- logic programs