Annotation of prominent words, prosodic boundaries and segmental lengthening by non-expert transcribers in the Spoken Dutch Corpus.
Jeska Buhmann, Johanneke Caspers, Vincent J. van Heuven, Heleen Hoekstra, Jean-Pierre Martens, Marc Swerts
Published in: LREC (2002)
Keyphrases
- spontaneous speech
- speech recognition
- English words
- prosodic features
- conversational speech
- word frequencies
- annotated corpus
- spoken words
- automatic speech recognition
- hidden Markov models
- spoken language
- text corpora
- word pairs
- multiword
- training corpus
- unknown words
- linguistic information
- hand-crafted
- text corpus
- linguistic features
- n-gram
- person names
- word sense
- lexical features
- related words
- world knowledge
- word co-occurrence
- manually annotated
- automatic annotation
- word sense disambiguation
- language model
- domain experts
- image annotation
- word frequency
- parallel corpus
- text mining
- noun phrases
- keywords
- broadcast news
- POS tagging
- active learning
- writing style
- statistical machine translation
- metadata
- automatic image annotation
- text-to-speech synthesis
- semantic relations
- speech signal
- relation extraction
- natural language text
- Wikipedia articles