One of these words is not like the other: a reproduction of outlier identification using non-contextual word representations.
Jesper Brink Andersen, Mikkel Bak Bertelsen, Mikkel Hørby Schou, Manuel R. Ciosici, Ira Assent
Published in: Eval4NLP (2020)
Keyphrases
- n-gram
- related words
- English words
- word recognition
- word sense disambiguation
- word pairs
- unknown words
- word segmentation
- word frequencies
- word meaning
- text corpus
- multiword
- word similarity
- Chinese word segmentation
- contextual information
- linguistic information
- lexical information
- keywords
- syntactic categories
- word co-occurrence
- noun phrases
- linguistic knowledge
- co-occurrence
- query words
- word spotting
- out-of-vocabulary
- stop words
- lexical features
- context sensitive
- handwritten words
- outlier detection
- frequency counts
- natural language
- historical manuscripts
- printed text
- spoken document retrieval
- word level
- natural language text
- word frequency
- punctuation marks
- compound words
- language independent
- translation model
- language specific
- handwriting recognition
- word sense
- speech recognition systems
- training corpus
- cross language information retrieval
- context dependent
- text documents
- text classification
- word meanings
- natural language processing
- word order
- Indian languages