Unsupervised language identification based on Latent Dirichlet Allocation.
Wei Zhang, Robert A. J. Clark, Yongyuan Wang, Wen Li
Published in: Comput. Speech Lang. (2016)
Keyphrases
- language identification
- latent dirichlet allocation
- topic modeling
- topic models
- lda model
- topic discovery
- probabilistic topic models
- generative model
- hierarchical bayesian models
- document images
- speaker identification
- gibbs sampling
- latent topics
- text mining
- semi-supervised
- supervised learning
- unsupervised manner
- search engine
- unsupervised learning
- probabilistic model
- hidden markov models
- bag of words
- visual features
- collaborative filtering
- similarity measure
- image processing
- feature selection
- computer vision
- machine learning