DIAGNOSE: Avoiding Out-of-distribution Data using Submodular Information Measures.
Suraj Kothawade, Akshit Srivastava, Venkat Iyer, Ganesh Ramakrishnan, Rishabh K. Iyer
Published in: CoRR (2022)
Keyphrases
- raw data
- huge amounts
- data sets
- database
- sensor data
- collected data
- complex data
- data processing
- original data
- data distribution
- information sources
- computer systems
- prior knowledge
- data sources
- data collection
- global information
- structural information
- data analysis
- stored data
- image data
- synthetic data
- high quality
- historical data
- heterogeneous data
- data repositories
- training data
- web data
- end users
- probability distribution
- background knowledge
- log data
- domain experts
- information extraction
- central processor
- website
- data mining tools
- heterogeneous sources
- information retrieval
- essential information
- complex structures
- private information
- multiple sources
- statistical tests
- temporal information
- spatial information
- data structure
- fault diagnosis
- input data
- data points
- domain knowledge