Document Similarity for Texts of Varying Lengths via Hidden Topics.
Hongyu Gong, Tarek Sakakini, Suma Bhat, Jinjun Xiong
Published in: ACL (1) (2018)
Keyphrases
- document similarity
- latent Dirichlet allocation
- text documents
- document clustering
- topic models
- document representation
- graph theory
- keywords
- cosine similarity
- word similarity
- semantic similarity
- generative model
- relevance model
- information retrieval
- clustering method
- index terms
- text mining
- vector space model
- distance measure