AURORA: An Information Extraction System of Domain-specific Business Documents with Limited Data.
Minh-Tien Nguyen, Le Tien Dung, Le Thai Linh, Nguyen Hong Son, Do Hoang Thai Duong, Bui Cong Minh, Nguyen Hai Phong, Nguyen Huu Hiep
Published in: CIKM (2020)
Keyphrases
- data sets
- domain specific
- information extraction
- data analysis
- data quality
- data collection
- database
- xml documents
- data points
- information retrieval
- web documents
- big data
- training data
- data sources
- general purpose
- information retrieval systems
- data processing
- machine learning
- natural language text
- textual data
- unstructured documents
- data objects
- raw data
- knowledge management
- decision making
- information systems