Characterizing In-Text Citations Using N-Gram Distributions.
Marc Bertin, Iana Atanassova
Published in: ISSI (2015)
Keyphrases
- n-gram
- language model
- character n-grams
- word level
- scientific papers
- language independent
- bag of words
- text classification
- web documents
- language specific
- language modeling
- variable length
- language modelling
- scientific literature
- text documents
- information retrieval
- part of speech
- text mining
- Viterbi algorithm
- word segmentation
- text retrieval
- databases
- information extraction
- machine learning