SparkGC: Spark based genome compression for large collections of genomes.
Haichang Yao, GuangYong Hu, Shangdong Liu, Houzhi Fang, Yimu Ji
Published in: BMC Bioinform. (2022)
Keyphrases
- phylogenetic analysis
- human genome
- sequence analysis
- sequence data
- comparative genomics
- genome sequences
- genomic sequences
- sequenced genomes
- genome rearrangements
- compression algorithm
- image compression
- Escherichia coli
- data compression
- DNA sequences
- genome wide
- genotype phenotype
- molecular biology
- genomic data
- digital libraries
- information retrieval
- compression scheme
- compression ratio
- metadata
- comparative analysis
- sequence alignment
- computational biology
- genome scale
- data collections
- data sets
- compression rate
- metabolic pathways