DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping.
Yongrui Chen, Haiyun Jiang, Xinting Huang, Shuming Shi, Guilin Qi
Published in: NAACL-HLT (2024)
Keyphrases
- data sets
- computer systems
- training data
- database
- image data
- data quality
- raw data
- data points
- data sources
- information retrieval
- statistical analysis
- data extraction
- machine learning
- experimental data
- synthetic data
- high dimensional data
- text mining
- data structure
- end users
- data analysis
- data collection
- social networks
- relational databases
- keywords
- knowledge discovery
- missing data
- sensor data
- statistically significant
- data distribution
- probability distribution