LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification.
Yiping Song, Juhua Zhang, Zhiliang Tian, Yuxin Yang, Minlie Huang, Dongsheng Li
Published in: CoRR (2024)
Keyphrases
- text classification
- data distribution
- data sets
- raw data
- knowledge discovery
- data processing
- data collection
- background knowledge
- data integration
- data mining techniques
- data points
- expert systems
- data analysis
- data sources
- private data
- data privacy
- data cleaning
- data quality
- original data
- domain experts
- database
- prior knowledge
- domain knowledge
- feature selection
- machine learning
- labeled data
- information loss
- sensitive data