LSHvec: a vector representation of DNA sequences using locality sensitive hashing and FastText word embeddings.
Lizhen Shi, Bo Chen
Published in: BCB (2021)
Keyphrases
- DNA sequences
- locality sensitive hashing
- vector representation
- binary codes
- similarity search
- document representation
- vector space
- nearest neighbor
- brute force
- distance measure
- nearest neighbor search
- similarity measure
- hash functions
- SIFT features
- indexing techniques
- metric space
- kNN
- co-occurrence
- hamming distance
- low dimensional
- range queries
- n-gram
- feature space
- keywords
- high dimensional data
- information retrieval systems
- probabilistic model
- distance computation