Representations of Language Varieties Are Reliable Given Corpus Similarity Measures.
Jonathan Dunn
Published in: VarDial@EACL (2021)
Keyphrases
- similarity measure
- spanish language
- programming language
- language learning
- parallel corpus
- cost effective
- semantic representations
- similarity function
- linguistic knowledge
- language processing
- semantic similarity
- natural language
- meaning representations
- linguistic patterns
- computational linguistics
- co-occurrence
- feature vectors
- euclidean distance
- test set
- mutual information
- data model