SpiCE: A New Open-Access Corpus of Conversational Bilingual Speech in Cantonese and English.
Khia A. Johnson, Molly Babel, Ivan Fong, Nancy Yiu
Published in: LREC (2020)
Keyphrases
- open access
- parallel corpus
- conversational speech
- broadcast news
- spoken language
- sentence pairs
- language resources
- parallel corpora
- statistical machine translation
- machine translation
- spontaneous speech
- chinese english
- cross lingual
- multiword
- finite state transducers
- comparable corpora
- machine translation system
- automatic speech recognition
- cross language information retrieval
- metadata
- cross language
- query translation
- word alignment
- speech recognition
- out of vocabulary
- text to speech
- english chinese
- language independent
- natural language
- target language
- word pairs
- bilingual dictionaries
- word level
- source language
- language learners
- translation model
- language model
- spoken document retrieval
- digital libraries
- database