Diacritics restoration based on word n-grams for Slovak texts.
Stefan Toth, Emanuel Zaymus, Michal Duracík, Patrik Hrkút, Matej Mesko
Published in: Open Comput. Sci. (2021)
Keyphrases
- n gram
- language model
- language independent
- weather forecast
- bag of words
- language modelling
- language modeling
- text classification
- word segmentation
- variable length
- viterbi algorithm
- part of speech
- web documents
- natural language
- text documents
- neural network
- character n grams
- inside outside algorithm
- word level
- keywords
- machine learning
- data mining