Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks.
Nandan Thakur, Nils Reimers, Johannes Daxenberger, Iryna Gurevych
Published in: NAACL-HLT (2021)
Keyphrases
- pairwise
- synthetic data
- data sets
- prior knowledge
- missing data
- similarity measure
- input data
- noisy data
- missing values
- detection method
- information loss
- preprocessing
- statistical methods
- significant improvement
- test data
- data collection
- big data
- database
- similarity matrix
- clustering method
- semi supervised
- image data
- high dimensional
- feature set
- natural language processing
- high dimensional data
- support vector machine
- data points
- computational complexity
- similarity function
- data structure
- bayesian networks
- statistical significance
- clustering algorithm