Topic Modeling and Word Sense Disambiguation on the Ancora corpus.
Rubén Izquierdo, Marten Postma, Piek Vossen
Published in: Proces. del Leng. Natural (2015)
Keyphrases
- topic modeling
- word sense disambiguation
- word sense
- topic models
- text corpora
- wordnet
- natural language processing
- latent Dirichlet allocation
- text mining
- co-occurrence
- machine translation
- text documents
- ambiguous words
- text classification
- information extraction
- collaborative filtering
- latent variables
- cross-lingual
- machine learning
- multiword
- text processing
- semantic similarity
- feature set
- probabilistic model
- part of speech
- prior knowledge
- natural language
- computational linguistics
- learning algorithm
- databases