Toward a Comparable Corpus of Latvian, Russian and English Tweets.
Dmitrijs Milajevs
Published in: BUCC@ACL (2017)
Keyphrases
- link grammar
- person names
- named entities
- open domain
- wide coverage
- statistical machine translation
- parallel corpus
- broad coverage
- english words
- social media
- penn treebank
- linguistic features
- multiword
- training corpus
- english language
- sentence pairs
- machine translation
- natural language
- unknown words
- hand crafted
- morphological analysis
- topic tracking
- machine translation system
- mono lingual
- word sense
- pos tagging
- cross lingual
- question answering
- language learning
- parallel corpora
- english text
- semantic roles
- answer questions
- news articles
- cross language information retrieval
- stop words
- broadcast news
- cross language
- natural language processing
- dependency parsing
- target language
- user generated content