Counting Lumps in Word Space: Density as a Measure of Corpus Homogeneity.
Magnus Sahlgren, Jussi Karlgren
Published in: SPIRE (2005)
Keyphrases
- probability measure
- word frequencies
- english words
- search space
- word pairs
- statistical machine translation
- text corpus
- similarity measure
- co-occurrence
- natural language text
- parallel corpus
- pointwise mutual information
- space time
- training corpus
- concept space
- vector space
- document level
- n-gram
- noun phrases
- multiword
- word frequency
- distance measure
- spontaneous speech
- word co-occurrence
- spatial coordinates
- conversational speech
- sentence level
- word sense