Protocol for collecting a corpus of spontaneous, conversational, Hispanic English.
Eva Knodt, Jared Bernstein, Ognjen Todic
Published in: LREC (1998)
Keyphrases
- conversational speech
- broadcast news
- link grammar
- automatic speech recognition
- spontaneous speech
- spoken language
- natural language
- person names
- wide coverage
- statistical machine translation
- parallel corpus
- lightweight
- broad coverage
- multi party
- multi modal
- spoken document retrieval
- english language
- open domain
- communication protocol
- machine translation
- pos tagging
- linguistic features
- conversational agent
- cross lingual
- cross language
- english words
- language learning
- mono lingual
- multiword
- semantic roles
- cryptographic protocols
- tcp ip
- authentication protocol
- training corpus
- machine translation system
- data collection
- penn treebank
- information retrieval
- unknown words
- human machine interaction
- speech recognition
- target language
- cross language information retrieval