Unsupervised Low-Dimensional Vector Representations for Words, Phrases and Text that are Transparent, Scalable, and produce Similarity Metrics that are Complementary to Neural Embeddings.
Neil R. Smalheiser, Gary Bonifield
Published in: CoRR (2018)
Keyphrases
- low dimensional
- similarity metrics
- keywords
- syntactic categories
- multiword
- vector space
- word pairs
- noun phrases
- high dimensional
- proper names
- dimensionality reduction
- similarity measure
- similarity metric
- linguistic information
- word frequency
- euclidean space
- high dimensional data
- text documents
- manifold learning
- principal component analysis
- data points
- syntactic information
- lexical semantics
- similarity computation
- semantic relations
- semantic similarity
- feature space
- document representation
- similarity measurement
- compound words
- part of speech
- feature vectors
- n gram
- text corpora
- semantic information
- text mining
- semi supervised
- low dimensional spaces
- latent space
- distance measure
- natural language processing
- face recognition
- information retrieval