Combining Large-Scale Unlabeled Corpus and Lexicon for Chinese Polysemous Word Similarity Computation.
Huiwei Zhou, Chen Jia, Yunlong Yang, Shixian Ning, Yingyu Lin, Degen Huang
Published in: CCIR (2017)
Keyphrases
- similarity computation
- unknown words
- word sense
- word sense disambiguation
- word segmentation
- similarity measure
- word pairs
- collaborative filtering
- cosine similarity
- multiword
- co-occurrence
- similarity metrics
- structural similarity
- part of speech
- handwritten words
- tf-idf
- WordNet
- training data
- n-gram
- semi-supervised learning
- natural language
- active learning
- similarity search
- labeled data
- statistical machine translation
- semantic similarity
- unlabeled data
- similarity join
- information retrieval
- training set
- information extraction
- text classification
- cross-lingual