Predicting Embedding Reliability in Low-Resource Settings Using Corpus Similarity Measures.
Jonathan Dunn, Haipeng Li, Damian Sastre
Published in: LREC (2022)
Keyphrases
- similarity measure
- resource allocation
- euclidean distance
- manually annotated
- high levels
- keywords
- feature vectors
- data sets
- mutual information
- distance measure
- graph embedding
- text corpora
- multidimensional scaling
- open domain
- robust image watermarking
- reliability analysis
- highly reliable
- nonlinear dimensionality reduction
- resource consumption
- semantic similarity
- similarity function
- vector space
- similarity search
- principal component analysis