PDF-to-Text Reanalysis for Linguistic Data Mining.
Michael Wayne Goodman, Ryan Georgi, Fei Xia
Published in: LREC (2018)
Keyphrases
- data mining
- text mining
- linguistic analysis
- linguistic information
- language generation
- pdf files
- natural language text
- probability density function
- data analysis
- data mining techniques
- knowledge discovery
- keywords
- data mining methods
- text generation
- linguistic patterns
- information retrieval
- database
- syntactic analysis
- natural language processing
- syntactic structures
- association rules
- rough sets
- data mining algorithms
- text retrieval
- free text
- real world
- web mining
- web documents
- data warehouse
- machine learning
- data mining applications
- textual data
- privacy preserving
- semantic representations
- information extraction
- computer science
- lexical semantics
- natural language
- bayesian networks