Building a Part-of-Speech Tagged Corpus for Drenjongke (Bhutia).
Mana Ashida, Seunghun Lee, Kunzang Namgyal
Published in: AACL/IJCNLP (Student Research Workshop) (2020)
Keyphrases
- part of speech
- pos tagging
- training corpus
- linguistic features
- multiword
- n-gram
- noun phrases
- linguistic information
- penn treebank
- unknown words
- syntactic features
- chinese word segmentation
- natural language processing
- tree bank
- word sense
- word sense disambiguation
- unsupervised grammar induction
- syntactic categories
- pos taggers
- text documents
- parse tree
- dependency parsing
- word segmentation
- ambiguous words
- tf-idf
- machine translation
- information retrieval
- knowledge representation
- feature vectors