Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models.
Mohammad Taher Pilehvar, Dimitri Kartsaklis, Victor Prokhorov, Nigel Collier
Published in: CoRR (2018)
Keyphrases
- co-occurrence
- latent topic models
- word sense disambiguation
- word counts
- model construction
- data structure
- image representation
- word recognition
- statistical models
- n-gram
- probabilistic model
- keywords
- machine learning
- WordNet
- neural network
- multiscale
- information retrieval
- text corpus
- geometric models
- outdoor images
- data mining
- real world