Automated Detection of Health Websites' HONcode Conformity: Can N-gram Tokenization Replace Stemming?
Célia Boyer, Ljiljana Dolamic, Natalia Grabar
Published in: MedInfo (2015)
Keyphrases
- n gram
- automated detection
- character n grams
- automated analysis
- language model
- variable length
- lung cancer
- language independent
- ultrasonic images
- viterbi algorithm
- part of speech
- language modeling
- text classification
- word segmentation
- language modelling
- web pages
- web documents
- language specific
- information retrieval
- information retrieval systems