uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation via Large-Scale Pseudo Labelling.
Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed
Published in: CoRR (2024)
Keyphrases
- raw data
- prior knowledge
- synthetic data
- data processing
- knowledge discovery
- data points
- statistical analysis
- data collection
- image data
- database
- data sets
- data analysis
- knowledge representation
- training data
- data mining techniques
- domain experts
- bayesian networks
- data quality
- expert knowledge
- data mining tools
- data distribution
- background knowledge
- multiple sources
- high dimensional data
- knowledge management
- small number
- data sources
- metadata
- knowledge base
- learning algorithm
- real world