Using Word Embeddings to Deter Intellectual Property Theft through Automated Generation of Fake Documents.
Almas Abdibayev, Dongkai Chen, Haipeng Chen, Deepti Poluru, V. S. Subrahmanian
Published in: ACM Trans. Manag. Inf. Syst. (2021)
Keyphrases
- intellectual property
- patent documents
- word frequencies
- vector space
- word spotting
- natural language text
- patent search
- keywords
- text corpus
- word frequency
- relevant documents
- related words
- multiword
- document collections
- clef ip
- information retrieval
- latent topics
- printed documents
- web documents
- linguistic information
- word pairs
- document clustering
- document retrieval
- page layout
- document analysis
- information retrieval systems
- co occurrence
- stop words
- patent information
- related documents
- spoken documents
- metadata
- n gram
- xml documents
- handwritten documents
- word recognition
- sentence level
- vector space model
- ranked list
- e government
- text mining
- word segmentation
- patent retrieval
- text documents
- user queries
- low dimensional