Using Word Familiarities and Word Associations to Measure Corpus Representativeness.
Reinhard Rapp
Published in: LREC (2014)
Keyphrases
- word frequencies
- unknown words
- n-gram
- text corpus
- English words
- word pairs
- multiword
- noun phrases
- natural language text
- manually annotated
- sentence level
- similarity measure
- word recognition
- conversational speech
- word frequency
- related words
- statistical machine translation
- linguistic knowledge
- word segmentation
- text categorization
- language model
- keywords