CED: Catalog Extraction from Documents.
Tong Zhu, Guoliang Zhang, Zechang Li, Zijian Yu, Junfei Ren, Mengsong Wu, Zhefeng Wang, Baoxing Huai, Pingfu Chao, Wenliang Chen
Published in: CoRR (2023)
Keyphrases
- document collections
- information retrieval systems
- text documents
- relevant documents
- legal documents
- automatic extraction
- document clustering
- web documents
- XML documents
- metadata
- information retrieval
- document classification
- free text
- database
- digital documents
- electronic documents
- web data
- document retrieval
- relational databases
- text categorization
- feature selection
- vector space model
- document representation
- textual content
- document content
- retrieval systems
- information extraction