Summarizing large text collection using topic modeling and clustering based on MapReduce framework.
Naresh Kumar Nagwani
Published in: J. Big Data (2015)
Keyphrases
- topic modeling
- text collections
- mapreduce framework
- text documents
- topic models
- cloud computing
- text mining
- text categorization
- information retrieval
- text classification
- textual data
- document collections
- latent dirichlet allocation
- text retrieval
- information extraction
- data management
- keywords
- named entities
- probabilistic model
- co-occurrence
- collaborative filtering
- bag of words
- support vector
- neural network
- document retrieval
- generative model
- inverted index
- database systems
- feature selection
- data analysis