Improving word embeddings in Portuguese: increasing accuracy while reducing the size of the corpus.
José Pedro Pinto, Paula Viana, Inês N. Teixeira, Maria T. Andrade
Published in: PeerJ Comput. Sci. (2022)
Keyphrases
- high accuracy
- text corpus
- computational complexity
- word pairs
- vector space
- error rate
- word frequencies
- computational cost
- English words
- POS tagging
- euclidean space
- Penn Treebank
- n-gram
- statistical machine translation
- negatively affect
- recognizing textual entailment
- feature selection
- machine translation system
- document level
- multiword
- word sense disambiguation
- machine translation