DisMo: A Morphosyntactic, Disfluency and Multi-Word Unit Annotator. An Evaluation on a Corpus of French Spontaneous and Read Speech.
George Christodoulides, Mathieu Avanzi, Jean-Philippe Goldman
Published in: LREC (2014)
Keyphrases
- multiword
- spontaneous speech
- conversational speech
- part of speech
- lexical units
- context sensitive
- text clustering
- text segments
- automatic speech recognition
- speech recognition
- language model
- n-gram
- knowledge discovery
- feature selection
- text documents
- spoken language
- broadcast news
- human machine interaction
- spoken document retrieval