Part of Speech Annotation of Intermediate Versions in the Keystroke Logged Translation Corpus.
Tatiana Serbina, Paula Niemietz, Matthias Fricke, Philipp Meisen, Stella Neumann
Published in: LAW@NAACL-HLT (2015)
Keyphrases
- part of speech
- training corpus
- pos tagging
- multiword
- machine translation
- linguistic features
- statistical machine translation
- noun phrases
- n-gram
- penn treebank
- natural language processing
- unknown words
- linguistic information
- word sense disambiguation
- tree bank
- chinese word segmentation
- target language
- syntactic features
- word sense
- active learning
- metadata
- pos taggers
- text documents
- dependency parsing
- information extraction
- parallel corpora
- source language
- unsupervised grammar induction
- syntactic categories
- word segmentation
- parse tree
- cross language information retrieval
- knowledge discovery
- image retrieval
- machine translation system
- translation model
- web documents
- wordnet
- text classification
- co-occurrence
- text mining
- knowledge representation