Automatic Extraction of Subcorpora based on Subcategorization Frames from a Part-of-Speech Tagged Corpus.
Susanne Gahl
Published in: COLING-ACL (1998)
Keyphrases
- automatic extraction
- part of speech
- pos tagging
- training corpus
- linguistic features
- multiword
- noun phrases
- n gram
- unknown words
- linguistic information
- penn treebank
- natural language processing
- syntactic features
- word sense
- natural language text
- tree bank
- pos taggers
- unsupervised grammar induction
- relation extraction
- word sense disambiguation
- chinese word segmentation
- syntactic categories
- parse tree
- word segmentation
- dependency parsing
- domain adaptation
- tf idf
- world knowledge
- text classification
- bayesian networks
- information extraction
- semi supervised
- language model
- ambiguous words
- machine translation