Probabilistic Management of OCR Data using an RDBMS.
Arun Kumar, Christopher Ré
Published in: Proc. VLDB Endow. (2011)
Keyphrases
- data sets
- data processing
- synthetic data
- uncertain data
- raw data
- data collection
- data analysis
- management system
- high quality
- prior knowledge
- probability distribution
- data points
- data structure
- database
- data mining algorithms
- information systems
- optical character recognition
- historical data
- databases
- information management
- data quality
- original data
- data distribution
- missing data
- database systems
- training data
- knowledge discovery