Document Embedding for Scientific Articles: Efficacy of Word Embeddings vs TFIDF.
Harm Jan (Arjan) Meijer, J. Truong, Reza Karimi
Published in: CoRR (2021)
Keyphrases
- scientific articles
- vector space
- term frequency
- term weighting
- topic modeling
- retrieval model
- document classification
- document representation
- text categorization
- text classification
- document frequency
- vector space model
- scientific literature
- latent semantic indexing
- tf-idf
- keywords
- information retrieval
- term weighting schemes
- manifold learning
- co-occurrence
- document images
- text documents
- similarity search
- document collections
- retrieval systems
- bag of words
- text mining
- n-gram