From Word Embeddings To Document Distances.
Matt J. Kusner, Yu Sun, Nicholas I. Kolkin, Kilian Q. Weinberger
Published in: ICML (2015)
Keyphrases
- text corpus
- latent topics
- keywords
- distance measure
- related words
- co-occurrence
- compound words
- information retrieval
- word co-occurrence
- term frequency
- term weighting
- noun phrases
- word level
- word clouds
- document retrieval
- document images
- document image retrieval
- retrieval systems
- relevant documents
- spoken document retrieval
- document space
- n-gram
- web documents
- document collections
- query words
- word segmentation
- tf-idf
- low dimensional
- document representation
- related documents
- search engine
- information retrieval systems
- distance function
- vector space
- word frequency
- keyword extraction
- printed documents
- handwritten documents
- document clustering
- manifold learning
- word recognition
- euclidean space
- vector space model
- word sense disambiguation
- multiword
- euclidean distance
- text categorization
- document classification