Classifying Documents by Viewpoint Using Word2Vec and Support Vector Machines.
Jeffrey Harwell, Yan Li
Published in: NLDB (2022)
Keyphrases
- viewpoint
- support vector
- word spotting
- large margin classifiers
- learning machines
- word frequencies
- keywords
- information retrieval systems
- document collections
- related words
- index terms
- printed documents
- information retrieval
- text corpus
- linguistic information
- word pairs
- word frequency
- natural language text
- sentence similarity
- document retrieval
- related documents
- concept space
- latent topics
- term frequency
- document space
- text documents
- support vector machine
- xml documents
- feature selection
- word recognition
- multiword
- kernel function
- web documents
- training corpus
- pre-classified
- spoken documents
- document analysis
- word similarity
- document clustering
- relevant documents
- sentence level
- multi-document summarization
- term weighting
- document classification
- character recognition
- historical documents
- co-occurrence
- metadata
- statistical topic models