Using Apache Spark on Hadoop Clusters as Backend for WebLicht Processing Pipelines.
Soheila Sahami, Thomas Eckart, Gerhard Heyer
Published in: CLARIN Annual Conference (2018)
Keyphrases
- back end
- open source
- data management
- big data
- map reduce
- user friendly
- clustering algorithm
- data processing
- building blocks
- data types
- cloud computing
- data sets
- data repositories
- open source software
- distributed systems
- information management
- parallel computation
- source code
- data structure
- open source projects
- database
- publish subscribe