Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation.
Da Yin, Xiao Liu, Fan Yin, Ming Zhong, Hritik Bansal, Jiawei Han, Kai-Wei Chang
Published in: EMNLP (2023)
Keyphrases
- database
- data sets
- image data
- high quality
- data structure
- data processing
- data collection
- statistical methods
- data acquisition
- statistical analysis
- knowledge discovery
- data sources
- data points
- complex data
- original data
- sensor data
- application domains
- raw data
- historical data
- data quality
- noisy data
- data objects
- neural network
- multimedia data
- machine learning
- missing data
- multimedia
- computer systems
- training data
- high dimensional
- dimensionality reduction