Searching the PDF Haystack: Automated Knowledge Discovery in Scanned EHR Documents.
Alex Kostrinsky-Thomas, Fuki M. Hisama, Thomas H. Payne
Published in: Appl. Clin. Inform. (2021)
Keyphrases
- knowledge discovery
- unstructured documents
- pdf documents
- pdf files
- information retrieval
- web documents
- scanned documents
- text documents
- document repositories
- intelligent search
- xml documents
- document classification
- data mining
- formal concept analysis
- probability density function
- document retrieval
- document collections
- information retrieval systems
- text mining
- machine learning
- scanned images
- relevant documents
- document clustering
- metadata
- association rules
- probability distribution function
- document analysis
- effective retrieval
- information extraction
- databases
- electronic health records
- data mining techniques
- web mining
- document images
- electronic documents
- medical records
- vector space model
- query expansion