Inclusive ASR for Disfluent Speech: Cascaded Large-Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation.
Dena F. Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Caryn Herring, Jia Bin
Published in: CoRR (2024)
Keyphrases
- fine tuning
- data sets
- synthetic data
- experimental data
- data processing
- learning algorithm
- background knowledge
- prior knowledge
- viable alternative
- training data
- speech recognition
- data analysis
- input data
- statistical analysis
- database
- image data
- supervised learning
- neural network
- real world
- audio stream
- audio visual
- learning models
- data quality
- original data
- machine learning
- feature selection
- data structure
- learning process
- xml documents
- computer systems
- data collection
- end users