SART - Similarity, Analogies, and Relatedness for Tatar Language: New Benchmark Datasets for Word Embeddings Evaluation.
Albina Khusainova, Adil Khan, Adín Ramírez Rivera
Published in: CICLing (1) (2019)
Keyphrases
- benchmark datasets
- co-occurrence
- UCI repository
- distance measure
- word similarity
- semantic similarity
- evaluation measures
- UCI machine learning repository
- ensemble methods
- English text
- similarity measure
- data sets
- computing semantic relatedness
- parallel corpus
- euclidean distance
- machine learning
- binary codes
- word sense disambiguation
- terms of classification accuracy
- n-gram
- lexical information
- prediction accuracy