SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages.
René Groh, Nina Goes, Andreas M. Kist
Published in: CoRR (2024)
Keyphrases
- cross lingual
- text classification
- language independent
- cross lingual information retrieval
- machine translation
- spoken language
- multi lingual
- cross language
- language modeling
- european languages
- event extraction
- automatic speech recognition
- feature selection
- statistical machine translation
- machine translation system
- linguistic resources
- machine learning
- transfer learning
- translation model
- language model
- parallel corpus
- language specific
- indian languages
- decision trees
- query translation
- generative model
- active learning
- parallel corpora
- feature vectors
- training set