Generating a bilingual lexical corpus using interlanguage normalized Levenshtein distances.
Amanda Post da Silveira, Jan-Willem van Leussen
Published in: ICPhS (2015)
Keyphrases
- word pairs
- lexical features
- parallel corpora
- parallel corpus
- Hamming distance
- sentence pairs
- Chinese English
- multiword
- bilingual dictionaries
- recognizing textual entailment
- WordNet
- linguistic information
- machine translation
- machine readable dictionaries
- natural language text
- statistical machine translation
- comparable corpora
- distance function
- distance measure
- query translation
- syntactic features
- cross lingual
- edit distance
- machine translation system
- parallel texts
- natural language processing
- domain specific
- lexical information
- similarity measure
- linguistic features
- cross language information retrieval
- knowledge base
- English Chinese
- topic models
- word alignment
- Euclidean distance
- semantic network
- text corpora