Bayesian Noisy Word Clustering through Sampling Prototypical Words.
Tadahiro Taniguchi, Yuta Fukusako, Toshiaki Takano
Published in: ICDL-EPIROB (2018)
Keyphrases
- distributional clustering
- n-gram
- english words
- related words
- word recognition
- word meaning
- word pairs
- word sense disambiguation
- information theoretic
- unknown words
- clustering algorithm
- clustering method
- multiword
- text categorization
- linguistic information
- word segmentation
- word sense
- k-means
- word frequencies
- linguistic knowledge
- word spotting
- latent topics
- text classification
- keywords
- text corpus
- random sampling
- query words
- stop words
- translation model
- noun phrases
- document clustering
- spoken document retrieval
- bayesian networks
- word meanings
- syntactic categories
- dirichlet process mixture models
- word co-occurrence
- chinese word segmentation
- word similarity
- handwritten words
- noisy environments
- language model
- lexical features
- word frequency
- wordnet
- automatic transcription