Using Word Association Norms to Measure Corpus Representativeness.
Reinhard Rapp
Published in: CICLing (1) (2014)
Keyphrases
- word frequencies
- English words
- text corpus
- multiword
- pointwise mutual information
- lexical features
- word sense disambiguation
- co-occurrence
- sentence level
- training corpus
- similarity measure
- recognizing textual entailment
- noun phrases
- manually annotated
- ambiguous words
- linguistic information
- related words
- distance measure
- text classification
- keywords
- information extraction
- word co-occurrence
- parallel corpus
- news articles
- statistical machine translation
- text corpora
- natural language text
- test set