BioPig: a Hadoop-based analytic toolkit for large-scale sequence data.
Henrik Nordberg, Karan Bhatia, Kai Wang, Zhong Wang
Published in: Bioinform. (2013)
Keyphrases
- sequence data
- open source
- sequence classification
- sequence analysis
- data intensive
- real world
- cloud computing
- biological sequences
- profile hidden Markov models
- massive scale
- big data
- cloud computing platform
- nucleotide sequences
- binding sites
- data analytics
- genome sequences
- sequential data
- MapReduce framework
- regulatory elements
- map reduce
- space time
- biological information
- distributed systems