Resolving the Imbalance Issue in Hierarchical Disciplinary Topic Inference via LLM-based Data Augmentation.
Xunxin Cai, Meng Xiao, Zhiyuan Ning, Yuanchun Zhou
Published in: CoRR (2023)
Keyphrases
- data sets
- synthetic data
- statistical analysis
- data collection
- data processing
- database
- small number
- original data
- data structure
- high quality
- training data
- inference process
- xml documents
- log data
- complex data
- databases
- raw data
- test data
- data distribution
- sensor data
- decision trees
- data mining techniques
- data analysis
- knowledge discovery
- probabilistic model
- data sources
- data model