FaDA: Fast Document Aligner using Word Embedding.
Pintu Lohar, Debasis Ganguly, Haithem Afli, Andy Way, Gareth J. F. Jones
Published in: Prague Bull. Math. Linguistics (2016)
Keyphrases
- keywords
- latent topics
- co-occurrence
- related words
- text corpus
- noun phrases
- compound words
- document collections
- information retrieval systems
- vector space
- document space
- spoken document retrieval
- word level
- document images
- document classification
- short list
- document image retrieval
- word co-occurrence
- index terms
- term frequency
- n-gram
- vector space model
- web documents
- retrieval systems
- word frequency
- printed documents
- document retrieval
- text documents
- word clouds
- semantic information
- relevant documents
- document clustering
- word sense disambiguation
- document level
- word recognition
- document representation
- document similarity
- concept space
- term weighting
- topic models
- information retrieval