Enhancing Tokenization by Embedding Romanian Language Specific Morphology.
Mihaela Alexandra Vasiu, Rodica Potolea
Published in: ICCP (2020)
Keyphrases
- query expansion
- language specific
- character n grams
- language neutral
- n gram
- language model
- language independent
- language modeling
- cross lingual
- natural language
- text retrieval
- machine translation
- out of vocabulary
- labor intensive
- specific features
- vector space
- cross language information retrieval
- image analysis
- variable length
- cross language
- named entities
- error prone
- text classification
- semi automatic
- machine learning
- word sense disambiguation
- knowledge representation