Feature Selection with Pretrained-BERT for Hate Speech and Offensive Content Identification in English and Hindi Languages.
Surya Agustian, Reski Saputra, Aidil Fadhilah
Published in: FIRE (Working Notes) (2021)
Keyphrases
- spoken language
- language identification
- english text
- feature selection
- speaker identification
- indian languages
- cross lingual
- target language
- multi lingual
- statistical machine translation
- machine translation
- comparable corpora
- source language
- speech recognition
- feature extraction
- query translation
- dialogue system
- language independent
- text to speech
- spontaneous speech
- speech signal
- natural language generation
- text categorization
- text classification
- broadcast news
- language specific
- multimedia
- metadata
- native language
- noisy environments
- document images
- machine translation system
- cross language information retrieval
- cross language
- arabic language
- bilingual dictionaries
- optical character recognition
- word order
- language model
- feature set
- support vector
- multilingual information retrieval