A Large Corpus of Product Reviews in Portuguese: Tackling Out-Of-Vocabulary Words.
Nathan Hartmann, Lucas Avanço, Pedro Paulo Balage Filho, Magali Sanches Duran, Maria das Graças Volpe Nunes, Thiago Alexandre Salgueiro Pardo, Sandra M. Aluísio
Published in: LREC (2014)
Keyphrases
- out of vocabulary
- product reviews
- parallel corpora
- hand crafted
- sentiment analysis
- language model
- n gram
- sentence level
- sentiment classification
- word segmentation
- cross language information retrieval
- opinion mining
- cross lingual
- cross language
- broadcast news
- named entity recognition
- parallel corpus
- linguistic features
- word pairs
- machine translation
- word level
- statistical machine translation
- natural language processing
- language independent
- named entities
- domain independent
- information sources
- text classification
- document retrieval
- query translation
- query terms
- wordnet
- user generated content
- information retrieval
- language modeling
- information extraction
- probabilistic model