Exploring a Choctaw Language Corpus with Word Vectors and Minimum Distance Length.
Jacqueline Brixey, David J. Sides, Timothy Vizthum, David R. Traum, Khalil Iskarous
Published in: LREC (2020)
Keyphrases
- minimum distance
- parallel corpus
- upper bound
- word frequencies
- machine translation system
- linguistic knowledge
- Euclidean distance
- natural language
- nearest neighbor
- target language
- word sense
- natural language text
- distance measurement
- language specific
- unknown words
- sentence level
- convex polyhedra
- English words
- noun phrases
- feature vectors
- probabilistic context-free grammars
- lexical features
- co-occurrence
- word pairs
- error-correcting codes
- multiword
- statistical machine translation
- source language
- training corpus
- cross-language information retrieval
- text corpus
- image analysis
- machine translation
- convex hull
- word sense disambiguation
- cross-lingual
- minimum distance classifier
- word segmentation
- bilingual dictionaries
- image segmentation
- text corpora
- dimensionality reduction
- lower bound
- pattern recognition