Speech Corpora Divergence Based Unsupervised Data Selection for ASR.
Changfeng Gao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan
Published in: CoRR (2023)
Keyphrases
- data sets
- training data
- knowledge discovery
- databases
- data structure
- speech recognition
- synthetic data
- data collection
- data points
- data processing
- input data
- experimental data
- high quality
- image data
- statistical analysis
- data analysis
- spatial data
- data distribution
- original data
- data quality
- text mining
- data mining techniques
- database
- prior knowledge
- unsupervised learning
- pattern recognition
- text data
- text corpora