Generating Bags of Words from the Sums of Their Word Embeddings.
Lyndon White, Roberto Togneri, Wei Liu, Mohammed Bennamoun
Published in: CICLing (1) (2016)
Keyphrases
- n-gram
- related words
- English words
- word sense disambiguation
- unknown words
- word recognition
- word meaning
- word pairs
- word frequencies
- word segmentation
- automatically generating
- text corpus
- multiword
- linguistic information
- word similarity
- syntactic categories
- stop words
- Chinese word segmentation
- noun phrases
- lexical information
- linguistic knowledge
- word spotting
- word co-occurrence
- natural language text
- co-occurrence
- frequency counts
- query words
- vector space
- keywords
- spoken document retrieval
- lexical features
- punctuation marks
- handwritten words
- word meanings
- word order
- word level
- training corpus
- latent topics
- distributional clustering
- compound words
- text corpora
- word sense
- low-dimensional
- dimensionality reduction
- numeral strings
- parallel corpus
- speech recognition systems
- out-of-vocabulary
- translation model
- printed text
- word frequency
- short list
- multiple instance learning
- feature vectors