Normalization of Non-Standard Words in Croatian Texts.
Slobodan Beliga, Miran Pobar, Sanda Martincic-Ipsic
Published in: CoRR (2015)
Keyphrases
- English words
- keywords
- text documents
- natural language text
- linguistic analysis
- Chinese texts
- human generated
- text corpus
- linguistic information
- world knowledge
- punctuation marks
- data sets
- legal texts
- textual features
- syntactic structures
- training corpus
- noun phrases
- n-gram
- information extraction
- machine learning
- neural network