One Size Does Not Fit All: Finding the Optimal N-gram Sizes for FastText Models across Languages.
Vít Novotný, Eniafe Festus Ayetiran, Dávid Lupták, Michal Stefánik, Petr Sojka
Published in: CoRR (2021)
Keyphrases
- n gram
- language independent
- language model
- language modelling
- character n grams
- variable length
- probabilistic model
- text classification
- finite state transducers
- language specific
- bag of words
- artificial intelligence
- information retrieval
- text mining
- language modeling
- cross lingual
- data analysis
- word segmentation
- machine learning
- neural network
- databases