PDF text classification to leverage information extraction from publication reports.
Duy Duc An Bui, Guilherme Del Fiol, Siddhartha Jonnalagadda
Published in: J. Biomed. Informatics (2016)
Keyphrases
- text classification
- information extraction
- text mining
- machine learning
- text documents
- text categorization
- probability density function
- natural language processing
- feature selection
- bag of words
- information retrieval
- structured data
- free text
- precision and recall
- naive bayes
- digital libraries
- semi structured
- document classification
- data cleaning
- sentiment analysis
- n gram
- text classifiers
- text data
- natural language
- mixture model
- named entities
- semantic features
- conditional random fields
- question answering
- textual data
- knn
- labeled data
- pdf files
- named entity recognition
- artificial intelligence
- web mining
- neural network
- web documents
- machine translation
- density function
- data analysis
- multi label
- probabilistic model
- classification accuracy