Data and its (dis)contents: A survey of dataset development and use in machine learning research.
Amandalynne Paullada, Inioluwa Deborah Raji, Emily M. Bender, Emily Denton, Alex Hanna
Published in: CoRR (2020)
Keyphrases
- machine learning
- database
- data sets
- data quality
- data analysis
- knowledge discovery
- original data
- statistical methods
- synthetic data
- data processing
- data points
- high quality
- training data
- case study
- training dataset
- data distribution
- missing data
- data collection
- image data
- relational databases
- data sources
- data mining techniques
- small number
- prior knowledge
- machine learning methods
- multimedia
- learning algorithm
- development process
- data mining