Method Description for CCKS 2021 Task 3: A Classification Approach of Scholar Structured Information Extraction from HTML Web Pages.
Haishun Nan, Wanshun Wei
Published in: CCKS (Evaluation Track) (2021)
Keyphrases
- information extraction
- classification method
- web pages
- classification accuracy
- support vector machine (SVM)
- support vector machine
- benchmark data sets
- classification algorithm
- detection method
- preprocessing
- clustering method
- pattern classification
- feature extraction
- classification scheme
- machine learning methods
- cross validation
- web browser
- machine learning
- web page classification
- web documents
- training samples
- probabilistic model
- pattern recognition
- support vector
- user interface
- feature vectors
- natural language
- objective function
- website
- feature selection