Web content topic modeling using LDA and HTML tags.
Hamza H. M. Altarturi, Muntadher Saadoon, Nor Badrul Anuar
Published in: PeerJ Comput. Sci. (2023)
Keyphrases
- web content
- topic modeling
- topic models
- latent dirichlet allocation
- website
- web pages
- text mining
- user generated
- web documents
- topic extraction
- latent topics
- latent semantic analysis
- semantic browsing
- text documents
- text classification
- social media
- generative model
- lda model
- variational inference
- pattern recognition
- knowledge base
- prior knowledge
- co-occurrence
- text corpora
- gibbs sampling
- probabilistic latent semantic analysis
- probabilistic model
- probabilistic topic models
- support vector
- latent variables