Tw-StAR at SemEval-2019 Task 5: N-gram embeddings for Hate Speech Detection in Multilingual Tweets.
Hala Mulki, Chedi Bechikh Ali, Hatem Haddad, Ismail Babaoglu
Published in: SemEval@NAACL-HLT (2019)
Keyphrases
- n gram
- language independent
- language model
- multi lingual
- language specific
- finite state transducers
- bag of words
- speech recognition
- language modeling
- text classification
- variable length
- language modelling
- cross lingual
- part of speech
- viterbi algorithm
- out of vocabulary
- word segmentation
- vector space
- social media
- statistical language modeling
- neural network
- inside outside algorithm
- parallel corpora
- news articles
- digital libraries
- search engine
- information retrieval