Collecting and visualizing data lineage of Spark jobs.
Alexander Schoenenwald, Simon Kern, Josef Viehhauser, Johannes Schildgen
Published in: Datenbank-Spektrum (2021)
Keyphrases
- data lineage
- fine grained
- processing times
- metadata
- job scheduling
- data collection
- parallel machines
- computational grids
- single machine scheduling problem
- identical machines
- flowshop
- preemptive scheduling
- release dates
- single machine
- scheduling problem
- precedence constraints
- identical parallel machines
- scheduling jobs
- scheduling strategy
- optimal scheduling
- parallel coordinates
- deteriorating jobs
- wireless sensor networks
- information systems