Zero-Cost, Arrow-Enabled Data Interface for Apache Spark.
Sebastiaan Alvarez Rodriguez, Jayjeet Chakrabroty, Aaron Chu, Ivo Jimenez, Jeff LeFevre, Carlos Maltzahn, Alexandru Uta
Published in: IEEE BigData (2021)
Keyphrases
- data sets
- knowledge discovery
- data collection
- data points
- high quality
- database
- complex data
- training data
- synthetic data
- missing data
- raw data
- original data
- data analysis
- neural network
- data quality
- high dimensional data
- data objects
- historical data
- data distribution
- computer systems
- probability distribution
- data sources