Improving Translation Lexicon Induction from Monolingual Corpora via Dependency Contexts and Part-of-Speech Equivalences.
Nikesh Garera, Chris Callison-Burch, David Yarowsky
Published in: CoNLL (2009)
Keyphrases
- part of speech
- machine translation
- natural language processing
- statistical machine translation
- training corpus
- Chinese-English
- POS tagging
- parallel corpus
- word sense disambiguation
- n-gram
- grammar induction
- cross language information retrieval
- natural language
- multiword
- machine translation system
- bilingual dictionaries
- language independent
- query translation
- target language
- parallel corpora
- translation model
- cross lingual
- POS taggers
- question answering
- information extraction
- source language
- word alignment
- domain specific
- comparable corpora
- syntactic categories
- WordNet
- cross language
- language model
- machine learning
- noun phrases
- out of vocabulary
- syntactic information
- lexical information
- query expansion
- parse tree
- text mining
- TF-IDF
- named entity recognition
- text documents
- named entities
- ambiguous words