Detecting Text Reuse with Modified and Weighted N-grams.
Rao Muhammad Adeel Nawab, Mark Stevenson, Paul D. Clough
Published in: *SEM@NAACL-HLT (2012)
Keyphrases
- n-gram
- character n-grams
- language model
- web documents
- word level
- variable length
- bag of words
- text classification
- language modelling
- language independent
- keywords
- language specific
- text retrieval
- language modeling
- information retrieval
- text mining
- word segmentation
- part of speech
- Viterbi algorithm
- databases
- machine learning
- neural network
- named entities
- artificial intelligence
- web search