A Compressed Sensing View of Unsupervised Text Embeddings, Bag-of-n-Grams, and LSTMs.
Sanjeev Arora, Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli
Published in: ICLR (Poster) (2018)
Keyphrases
- n gram
- compressed sensing
- bag of words
- image reconstruction
- character n grams
- text documents
- language model
- random projections
- variable length
- sparse representation
- web documents
- text classification
- supervised learning
- unsupervised learning
- part of speech
- natural images
- language specific
- text mining
- information retrieval
- signal processing
- dimensionality reduction
- text retrieval
- low dimensional
- image representation
- image classification
- semi supervised
- fourier domain
- cross language
- machine learning
- feature extraction
- high quality
- document clustering
- multiscale
- keywords
- data points
- text categorization
- data sets