New word analogy corpus for exploring embeddings of Czech words.
Lukás Svoboda, Tomás Brychcín
Published in: CoRR (2016)
Keyphrases
- english words
- word frequencies
- unknown words
- word pairs
- multiword
- text corpus
- word co-occurrence
- word sense
- parallel corpus
- linguistic information
- lexical features
- related words
- language independent
- training corpus
- noun modifier
- word sense disambiguation
- n-gram
- noun phrases
- co-occurrence
- word segmentation
- stop words
- semantic relations
- word recognition
- ambiguous words
- word frequency
- part of speech
- natural language text
- spontaneous speech
- text corpora
- word meaning
- semantic similarity
- keyword extraction
- WordNet
- cross language
- spoken document retrieval
- lexical information
- syntactic categories
- word similarity
- keywords
- linguistic knowledge
- conversational speech
- translation model
- topic models
- vector space
- world knowledge