ProPOSEC: A Prosody and PoS Annotated Spoken English Corpus.
Claire Brierley, Eric Atwell
Published in: LREC (2010)
Keyphrases
- linguistic features
- manually annotated
- training corpus
- unknown words
- hand crafted
- part of speech
- text to speech
- POS tagging
- annotated corpus
- link grammar
- spoken language
- statistical machine translation
- person names
- Penn Treebank
- open domain
- spontaneous speech
- broad coverage
- named entities
- english words
- parallel corpus
- relation extraction
- wide coverage
- GENIA corpus
- text classification
- speech recognition
- parse tree
- natural language processing
- n-gram
- manually constructed
- broadcast news
- conversational speech
- english language
- speech synthesis
- named entity recognition
- semantic analysis
- cross language information retrieval
- natural language
- machine translation
- multiword
- semantic features
- semantic roles
- cross language
- monolingual
- sentence pairs
- language learning
- word sense disambiguation
- translation model
- word sense
- probabilistic model
- source language
- tree bank
- syntactic information
- dialogue system
- co-occurrence
- word segmentation
- question answering
- answer questions
- english text
- automatic speech recognition
- cross lingual
- comparable corpora
- query translation